Method

Field of the Invention 

The present invention relates to methods of improving the safety of retroviral
vectors capable of delivering therapeutic genes for use in gene therapy, and to
novel nucleotide sequences for use in such methods.

Background to the Invention 

Retroviral vectors are now widely used as vehicles to deliver genes into cells.
Their popularity stems from the fact that they are easy to produce and mediate 
stable integration of the gene that they carry into the genome of the target cell. 
This enables long-term expression of the delivered gene (1). 

There has been considerable interest, for some time, in the development of retroviral
vector systems based on lentiviruses. Lentiviruses are a small subgroup of complex
retroviruses. They contain, in addition to the common retroviral genes (gag, pol and
env), genes which enable them to regulate their life cycle and to infect non-dividing
cells (2). Vector systems based thereon are therefore of interest because of their
potential use in the transfer of a gene of interest to non-dividing cells such as
neurones. In addition, lentiviral vectors enable very stable long-term expression of
the gene of interest. This has been shown to be at least three months for
transduced rat neuronal cells, while MLV based vectors were only able to express
the gene of interest for six weeks.

The most commonly used lentivirus is the Human Immunodeficiency Virus (HIV),
the etiologic agent of AIDS (acquired immune deficiency syndrome). HIV-based
vectors have been shown to efficiently transduce non-dividing cells (3) and can be
used, for example, to target anti-HIV therapeutic genes to HIV susceptible cells.

WO 01/79518

PCT/GB01/01784
However, HIV vectors have a number of significant disadvantages that may limit
their therapeutic application to certain diseases. In particular, HIV-1 is a human
pathogen carrying potentially oncogenic proteins and sequences. There is the risk
that introduction of vector particles produced in packaging cells which express
HIV gag-pol will introduce these proteins into the patient, leading to
seroconversion.

Emphasis has therefore been placed on the safety of these vectors. One strategy
looks at the design of production systems for retroviral vectors. A retrovirus
vector system basically consists of two elements, a packaging cell line and a
vector genome. The simplest packaging line consists of a provirus in which the ψ
sequence (a determinant of RNA packaging, reported in HIV as lying between
U5 and gag) has been deleted. When stably transfected into a cell, virus particles
containing reverse transcriptase will be produced but virion RNA will not
become packaged within these particles. The complementing component in a
retrovirus vector system is the genome vector itself. The genome vector needs to
contain a packaging sequence but much of the structural coding regions can be
deleted. Often a selectable marker gene, or other nucleotide sequence of interest,
is incorporated into the vector. Vector stocks of the packaging line can then be
used to infect target cells. Provided the cell is successfully infected by the viral
particle, the genome vector sequence will be reverse transcribed and integrated by
the retroviral machinery. However, infection is an end process so no further
replication or spread of the vector should occur.

As indicated above, however, problems are encountered in the design of safe and
effective retroviral vectors. These include the possibility that recombination
between the packaging vector and the packaging sequence can lead to the
generation of wild type replication competent virus. Consequently efforts have
been directed at improving the safety of packaging cell constructs.

In second generation packaging cell lines, in addition to deletion of the packaging
sequence, the 3' LTR was also deleted so that two recombinations are necessary
to generate a wild type virus.

In third generation packaging lines the gag-pol genes and env gene are placed on
separate constructs that are sequentially introduced into the packaging cells to
prevent recombination during transfection.

With regard to the packaging signal, EP 0 368 882A (Sodroski) discloses that in
HIV it corresponds to the region between the 5' major splice donor and the gag
initiation codon, and particularly corresponds to a segment just downstream of
the 5' major splice donor, and about 14 bases upstream of the gag initiation
codon. It is this region which Sodroski teaches should be deleted from the
gag-pol cassette. WO97/12622 (Verma) describes that in HIV-1 a 39 bp internal
deletion in the ψ sequence can be made between the 5' splice donor site and the
starting codon of the gag gene.

Codon wobbling can be used to reduce recombination frequency while
maintaining the primary protein sequence of the constructs, cf. (4), in which the
region of overlap between the gag-pol and env expression constructs was reduced
to 61 bp extending over the common region between pol and env, which are in
different reading frames. Transversion mutations were introduced into the final
20 codons of pol, retaining the integrity of the coding region while reducing the
homology with env to 55% in the overlap region. Similarly, wobble mutations
were introduced into the 3' end of env and all sequences downstream of the env stop
codon were deleted.
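
As a rough illustration of the kind of homology figure quoted above, the
following Python sketch computes the percent identity between two aligned
sequences of equal length, together with the longest uninterrupted run of
identical nucleotides. The function names and the toy sequences are
assumptions made for illustration only; they are not taken from reference (4).

```python
def percent_identity(seq_a, seq_b):
    """Percentage of positions at which two aligned, equal-length sequences match."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


def longest_identical_run(seq_a, seq_b):
    """Length of the longest stretch of consecutive matching positions."""
    best = run = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        run = run + 1 if a == b else 0
        best = max(best, run)
    return best


# Toy example (hypothetical sequences): a wobble change at every third base.
wild_type = "GCTGCTGCTGCT"
wobbled = "GCCGCCGCCGCC"
print(percent_identity(wild_type, wobbled))       # 8 of 12 positions match
print(longest_identical_run(wild_type, wobbled))  # only two-base runs remain
```

Both toy sequences encode the same four alanine residues, so the protein is
unchanged while the stretches of uninterrupted identity become shorter.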

Efficient vectors usually contain part of gag on the genome vector to increase
virion titre. Unlike the packaging sequence, which can be in any position within a
sequence to effect packaging, the gag sequence must be in its native position
adjacent to ψ to have any effect.

It will be appreciated that whilst significant improvements in packaging cell and
vector design have been made there is still scope for further refinement of current
packaging lines.

Summary of the Invention 

It is therefore an aim of the invention to provide retroviral particles, in particular
lentiviral particles, and particularly those which carry nucleotide constructs
encoding therapeutic proteins, that have improved safety over the corresponding
wild type viral particle. In our WO99/41397 we describe codon optimisation of
the gag-pol genes as a means of overcoming the Rev/RRE requirement for export
and to enhance RNA stability. We have now found, however, that the codon
optimised gag-pol sequence overcomes potential recombination problems with
vector genomes which carry part of a gag sequence with the aim of increasing
titre. This strategy also avoids the need to use gag regions from different viruses
in the packaging and vector genome constructs.

Another significant advantage provided by the invention is that the codon
optimisation disrupts RNA secondary structures, such as the packaging signal,
thus rendering the gag-pol mRNA non-packageable. Thus, the present invention
allows retroviral sequence upstream of the gag initiation codon to be retained, in
contrast to Sodroski and Verma, without significantly compromising safety.

Statements of the Invention

Accordingly in one aspect the present invention provides use of a nucleotide
sequence coding for retroviral gag and pol proteins, capable of assembly of a
retroviral vector genome into a retroviral particle in a producer cell, to generate a
replication defective retrovirus in a target cell, wherein the nucleotide sequence is
codon optimised for expression in the producer cell.

Thus in one embodiment the present invention provides use of a nucleotide
sequence coding for retroviral gag and pol proteins capable of assembly of a
retroviral vector genome into a retroviral particle in a producer cell to reduce or
prevent packaging of the retroviral vector genome in a target cell, wherein the
nucleotide sequence is codon optimised for expression in the producer cell.

In another embodiment the present invention provides use of a nucleotide
sequence coding for retroviral gag and pol proteins, capable of assembly of a
retroviral vector genome comprising at least part of a gag nucleotide sequence
into a retroviral particle in a producer cell, to reduce or prevent recombination
between said nucleotide sequence coding for retroviral gag and pol proteins and
the at least part of a gag nucleotide sequence, wherein the nucleotide sequence
coding for retroviral gag and pol proteins is codon optimised for expression in the
producer cell.

Put another way the present invention provides a method of producing a
replication defective retrovirus comprising transfecting a producer cell with the
following:

i) a retroviral genome; 

ii) a nucleotide sequence coding for retroviral gag and pol proteins; 
and 

iii) nucleotide sequences encoding other essential viral packaging 
components not encoded by the nucleotide sequence of (ii); 

characterised in that the nucleotide sequence coding for retroviral gag and pol 
proteins is codon optimised for expression in the producer cell. 

Thus in one embodiment the present invention provides a method of reducing or
preventing packaging of a retroviral genome in a target cell comprising the steps
of:

a. transfecting a producer cell with the following to produce
retroviral particles:

i) a retroviral genome;

ii) a nucleotide sequence coding for retroviral gag and pol proteins;

and 

iii) nucleotide sequences encoding other essential viral packaging 
components not encoded by one or more of the nucleotide 
sequences of (ii); and 

b. transfecting a target cell with retroviral particles of step (a); 
characterised in that the nucleotide sequence coding for retroviral gag and pol 
proteins is codon optimised for expression in the producer cell. 

In another embodiment the present invention provides a method to reduce or 
prevent recombination between a retroviral vector genome and a nucleotide 
sequence encoding a viral polypeptide required for the assembly of the viral 
genome into retroviral particles comprising transfecting a producer cell with the
following: 

(i) a retroviral genome comprising at least part of a gag nucleotide 
sequence; 

(ii) a nucleotide sequence coding for retroviral gag and pol proteins; 

and 

(iii) nucleotide sequences encoding other essential viral packaging 
components not encoded by the nucleotide sequence of (ii); 

characterised in that the nucleotide sequence coding for retroviral gag and pol
proteins is codon optimised for expression in the producer cell. 

We also provide novel codon optimised sequences as shown in SEQ ID NOS: 15
and 16 and which may be used in the present invention. However, it will be
appreciated that any convenient codon optimised gag-pol sequence may be
employed in the invention.

The present invention further provides a retroviral particle produced using the
sequences of the present invention, and production methods for so doing.

The present invention also provides a pharmaceutical composition comprising a
viral particle according to the present invention, together with a pharmaceutically
acceptable diluent or carrier.

By "reducing" we mean that the chance of an event occurring is reduced
compared to a comparable population having the wild-type gag-pol sequence.
Within a population the chance of an event occurring may be prevented for an
individual retrovirus vector or particle.

Detailed Description of the Invention 

Various preferred features and embodiments of the present invention will now be 
described by way of non-limiting example. 

The present invention employs the concept of codon optimisation. 

Codon optimisation has previously been described in our WO99/41397 as a
means of overcoming the Rev/RRE requirement for export and to enhance RNA
stability. The alterations to the coding sequences for the viral components
improve the sequences for codon usage in the mammalian cells or other cells
which are to act as the producer cells for retroviral vector particle production.
This improvement in codon usage is referred to as "codon optimisation". Many
viruses, including HIV and other lentiviruses, use a large number of rare codons
and by changing these to correspond to commonly used mammalian codons,
increased expression of the packaging components in mammalian producer cells
can be achieved. Codon usage tables are known in the art for mammalian cells,
as well as for a variety of other organisms.
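
By way of a minimal sketch only, the recoding step described above can be
pictured as replacing each codon with a preferred synonymous codon while
leaving the encoded amino acid sequence unchanged. In the Python example
below, the PREFERRED_CODON table, the function names and the input sequence
are illustrative assumptions; the table is a placeholder and is not the codon
usage table of Figure 3b, nor are the outputs the sequences of SEQ ID NOS: 15
and 16.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

# Hypothetical "preferred codon" table: one codon per amino acid.
# Placeholder values for illustration; a real table would come from
# measured codon usage in the chosen producer cell.
PREFERRED_CODON = {
    "A": "GCC", "R": "CGC", "N": "AAC", "D": "GAC", "C": "TGC",
    "Q": "CAG", "E": "GAG", "G": "GGC", "H": "CAC", "I": "ATC",
    "L": "CTG", "K": "AAG", "M": "ATG", "F": "TTC", "P": "CCC",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAC", "V": "GTG",
    "*": "TGA",
}


def translate(seq):
    """Translate a DNA coding sequence, ignoring any trailing partial codon."""
    seq = seq.upper()
    return "".join(CODON_TO_AA[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


def codon_optimise(seq, preferred=PREFERRED_CODON):
    """Replace every codon with the preferred synonymous codon."""
    seq = seq.upper()
    if len(seq) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(
        preferred[CODON_TO_AA[seq[i:i + 3]]] for i in range(0, len(seq), 3)
    )


# Toy input (hypothetical): six codons encoding M-G-A-R-A-S.
wild_type = "ATGGGTGCGAGAGCGTCA"
optimised = codon_optimise(wild_type)
assert translate(optimised) == translate(wild_type)  # protein is unchanged
print(optimised)
```

A one-codon-per-amino-acid table is the simplest possible choice; it changes
mainly the third base and, for some amino acids such as arginine and serine,
other positions as well, while the translated protein stays the same.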

By virtue of alterations in their sequences, the nucleotide sequences encoding the
packaging components of the viral particles required for assembly of viral
particles in the producer cells/packaging cells have RNA instability sequences
(INS) eliminated from them. At the same time, the amino acid coding sequence
for the packaging components is retained so that the viral components encoded
by the sequences remain the same, or at least sufficiently similar that the function
of the packaging components is not compromised.

The term "viral polypeptide required for the assembly of viral particles" means a
polypeptide normally encoded by the viral genome to be packaged into viral
particles, in the absence of which the viral genome cannot be packaged. For
example, in the context of retroviruses such polypeptides would include gag-pol
and env. The term "packaging component" is also included within this definition.

As discussed in our WO99/32646, the sequence requirements for packaging HIV
vector genomes are complex. The HIV-1 packaging signal encompasses the
splice donor site and contains a portion of the 5'-untranslated region of the gag
gene, which has a putative secondary structure containing 4 short stem-loops.
However, additional sequences elsewhere in the genome are also known to be
important for efficient encapsidation of HIV. For example, the first 350 bps of
the gag protein coding sequence may contribute to efficient packaging. Thus, for
construction of HIV-1 vectors capable of expressing heterologous genes, a
packaging signal extending to 350 bps of the gag protein-coding region has been
used on the vector genome. We have now found that codon optimisation of the
gag coding region on the packaging vector, at least in the region into which the
packaging signal extends, also has the effect of disrupting packaging of the vector
genome. Thus codon optimisation is a novel method of obtaining a replication
defective viral particle.

Also as disclosed in WO99/32646, the structure of the packaging signal in equine
lentiviruses is different from that of HIV. Instead of a short sequence of 4 stem
loops together with a packaging signal extending to 350 bps of the gag protein-coding
region, we have found that in equine lentiviruses the packaging signal
may not extend as far into the gag protein-coding region as may have been
thought.

In one embodiment only codons relating to the packaging signal are codon
optimised. Thus, in one embodiment, codon optimisation extends to at least the
first 350 bps of the gag protein coding region. In equine lentiviruses, at least,
codon optimisation extends to at least nucleotide 300 of the gag coding region,
more preferably to at least nucleotide 150 of the gag coding region. Although not
optimal, codon optimisation could extend to, say, only the first 109 nucleotides of
the gag coding region. It may also be possible for codon optimisation to extend
to only the first codon of the gag coding region.

However, in a much more preferred and practical embodiment, the sequences are
codon optimised in their entirety, with the exception of the sequence
encompassing the frameshift site.

The gag-pol gene comprises two overlapping reading frames encoding gag and
pol proteins respectively. The expression of both proteins depends on a
frameshift during translation. This frameshift occurs as a result of ribosome
"slippage" during translation. This slippage is thought to be caused at least in
part by ribosome-stalling RNA secondary structures. Such secondary structures
exist downstream of the frameshift site in the gag-pol gene. For HIV, the region
of overlap extends from nucleotide 1222 downstream of the beginning of gag
(wherein nucleotide 1 is the A of the gag ATG) to the end of gag (nt 1503).
Consequently, a 281 bp fragment spanning the frameshift site and the
overlapping region of the two reading frames is preferably not codon optimised.
Retaining this fragment will enable more efficient expression of the gag-pol
proteins.
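
The idea of optimising everything except a retained wild type window can be
sketched as follows. This Python example is an illustration under stated
assumptions: the function name recode_outside_window and the recode_codon
callback are hypothetical, codons are read in the gag frame starting at the A
of the ATG, and any codon that even partly overlaps the window is left
untouched. The HIV coordinates are those quoted in the text.

```python
# Coordinates quoted in the text (nucleotide 1 = the A of the gag ATG).
HIV_OVERLAP_START = 1222
HIV_OVERLAP_END = 1503

# The difference between the two quoted positions matches the 281 bp
# fragment length given in the text.
assert HIV_OVERLAP_END - HIV_OVERLAP_START == 281


def recode_outside_window(seq, window_start, window_end, recode_codon):
    """Apply recode_codon to each codon lying wholly outside the window.

    window_start and window_end are 1-based nucleotide positions, inclusive.
    Codons that overlap the window are returned exactly as in the input.
    recode_codon is a caller-supplied function mapping one codon to another
    (hypothetical here; for example a lookup in a codon usage table).
    """
    if len(seq) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    out = []
    for i in range(0, len(seq), 3):
        codon = seq[i:i + 3]
        first, last = i + 1, i + 3  # 1-based positions covered by this codon
        overlaps = first <= window_end and last >= window_start
        out.append(codon if overlaps else recode_codon(codon))
    return "".join(out)


# Toy demonstration: lower-casing stands in for "recoding" so that the
# protected window (positions 5 to 7) is easy to see in the output.
print(recode_outside_window("AAACCCGGGTTT", 5, 7, str.lower))
```

In the toy call the second and third codons both touch the window and are
therefore kept in upper case, while the first and last codons are "recoded".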

For EIAV the beginning of the overlap has been taken to be nt 1262 (where
nucleotide 1 is the A of the gag ATG). The end of the overlap is at 1461 bp. In
order to ensure that the frameshift site and the gag-pol overlap are preserved, the
wild type sequence has been retained from nt 1156 to 1465. This can be seen in Figure 9b.

Deviations from optimal codon usage may be made, for example, in order to
accommodate convenient restriction sites, and conservative amino acid changes
may be introduced into the gag-pol proteins.

In a highly preferred embodiment, codon optimisation was based on highly
expressed mammalian genes. The third and sometimes the second and third base
may be changed. An example of a codon usage table is given in Figure 3b.

Due to the degenerate nature of the Genetic Code, it will be appreciated that
numerous gag-pol sequences can be achieved by a skilled worker. Also there are
many retroviral variants described and which can be used as a starting point for
generating a codon optimised gag-pol sequence. Lentiviral genomes can be quite
variable. For example there are many quasi-species of HIV-1 which are still
functional. This is also the case for EIAV. These variants may be used to
enhance particular parts of the transduction process. Examples of HIV-1
variants may be found at http://hiv-web.lanl.gov . Details of EIAV clones may be
found at the NCBI database: http://www.ncbi.nlm.nih.gov .

The strategy for codon optimised gag-pol sequences can be used in relation to
any retrovirus. This would apply to all the lentiviruses, including EIAV, FIV,
BIV, CAEV, VMR, SIV, HIV-1 and HIV-2. In addition this method could be
used to increase expression of genes from HTLV-1, HTLV-2, HFV, HSRV and
human endogenous retroviruses (HERV).

As codon optimisation may result in disruption of RNA secondary structures such
as the packaging signal, it will be appreciated that any endogenous packaging
signal upstream of the gag initiation codon could be retained without
compromising safety.

An additional advantage of codon optimising packaging components is that this
can increase gene expression. In particular, it can render gag-pol expression Rev
independent. In order to enable the use of anti-rev or RRE factors in the retroviral
vector, however, it would be necessary to render the viral vector generation
system totally Rev/RRE independent (5). Thus, the genome also needs to be
modified. This is achieved by optimising vector genome components.
Advantageously, these modifications also lead to the production of a safer system
absent of all accessory proteins both in the producer and in the transduced cell,
and are described below.

As described above, the packaging components for a retroviral vector include
expression products of gag, pol and env genes. In addition, efficient packaging
depends on a short sequence of 4 stem loops followed by a partial sequence from
gag and env (the "packaging signal"). Thus, inclusion of a deleted gag sequence
in the retroviral vector genome (in addition to the full gag sequence on the
packaging construct) will optimise vector titre. To date efficient packaging has
been reported to require from 255 to 360 nucleotides of gag in vectors that still
retain env sequences, or about 40 nucleotides of gag in a particular combination
of splice donor mutation, gag and env deletions. We have surprisingly found that
a deletion of up to 360 nucleotides in gag leads to an increase in vector titre.
Further deletions resulted in lower titres. Additional mutations at the major
splice donor site upstream of gag were found to disrupt packaging signal
secondary structure and therefore lead to decreased vector titre. Thus, preferably,
the retroviral vector genome includes a gag sequence from which up to 360
nucleotides have been removed.

We therefore allow the preparation of a so-called "minimal" system in which all
of the accessory genes may be removed. In HIV these accessory genes are vpr,
vif, tat, nef, vpu and rev. Similarly, in other lentiviruses the analogous accessory
genes normally present in the lentivirus may be removed. For the avoidance of
doubt, however, we would mention that the present invention also extends to
systems, particles and vectors in which one or more of these accessory genes is
present and in any combination.

The term "viral vector" refers to a nucleotide construct comprising a viral
genome capable of being transcribed in a host cell, which genome comprises
sufficient viral genetic information to allow packaging of the viral RNA genome,
in the presence of packaging components, into a viral particle capable of infecting
a target cell. Infection of the target cell includes reverse transcription and
integration into the target cell genome, where appropriate for particular viruses.
The viral vector in use typically carries heterologous coding sequences
(nucleotides of interest or "NOIs") which are to be delivered by the vector to the
target cell, for example a first nucleotide sequence encoding a ribozyme. By
"replication defective" we mean that a viral vector is incapable of independent
replication to produce infectious viral particles within the final target cell.

The term "viral vector system" is intended to mean a kit of parts which can be
used when combined with other necessary components for viral particle
production to produce viral particles in host cells. For example, an NOI may
typically be present in a plasmid vector construct suitable for cloning the NOI
into a viral genome vector construct. When combined in a kit with a further
nucleotide sequence, which will also typically be present in a separate plasmid
vector construct, the resulting combination of plasmid containing the NOI and
plasmid containing the further nucleotide sequence comprises the essential
elements of the invention. Such a kit may then be used by the skilled person in
the production of suitable viral vector genome constructs which, when transfected
into a host cell together with the plasmid containing the further nucleotide
sequence, and optionally nucleic acid constructs encoding other components
required for viral assembly, will lead to the production of infectious viral
particles.

Alternatively, the further nucleotide sequence may be stably present within a 
packaging cell line that is included in the kit. 

The kit may include the other components needed to produce viral particles, such
as host cells and other plasmids encoding essential viral polypeptides required for
viral assembly. By way of example, the kit may contain (i) a plasmid containing
an NOI and (ii) a plasmid containing a further nucleotide sequence encoding a
modified retroviral gag-pol construct which has been codon optimised for
expression in a producer of choice. Optional components would then be (a) a
retroviral genome construct with suitable restriction enzyme recognition sites for
cloning the NOI into the viral genome, optionally with at least a partial gag
sequence; (b) a plasmid encoding a VSV-G env protein. Alternatively,
nucleotide sequence encoding viral polypeptides required for assembly of viral
particles may be provided in the kit as packaging cell lines comprising the
nucleotide sequences, for example a VSV-G expressing cell line.



The term "viral vector production system" refers to the viral vector system
described above wherein the NOI has already been inserted into a suitable viral
vector genome.

In the present invention, several terms are used interchangeably. Thus, "virion",
"virus", "viral particle", "retroviral particle", "retrovirus", and "vector particle"
mean virus and virus-like particles that are capable of introducing a nucleic acid
into a cell through a viral-like entry mechanism. Such vector particles can, under
certain circumstances, mediate the transfer of NOIs into the cells they infect. A
retrovirus is capable of reverse transcribing its genetic material into DNA and
incorporating this genetic material into a target cell's DNA upon transduction.
Such cells are designated herein as "target cells".

As used herein the term "target cell" simply refers to a cell which the regulated
retroviral vector of the present invention, whether native or targeted, is capable of
infecting or transducing.

A lentiviral vector particle according to the invention will be capable of
transducing cells which are slowly-dividing, and which non-lentiviruses such as
MLV would not be able to efficiently transduce. Slowly-dividing cells, including
certain tumour cells, divide once in about every three to four days. Although
tumours contain rapidly dividing cells, some tumour cells, especially those in the
centre of the tumour, divide infrequently.

Alternatively the target cell may be a growth-arrested cell capable of undergoing
cell division such as a cell in a central portion of a tumour mass or a stem cell
such as a haematopoietic stem cell or a CD34-positive cell.

As a further alternative, the target cell may be a precursor of a differentiated cell
such as a monocyte precursor, a CD33-positive cell, or a myeloid precursor.

As a further alternative, the target cell may be a differentiated cell such as a
neuron, astrocyte, glial cell, microglial cell, macrophage, monocyte, epithelial
cell, endothelial cell, hepatocyte, spermatocyte, spermatid or spermatozoa.

Target cells may be transduced either in vitro after isolation from a human 
individual or may be transduced directly in vivo. 

Viral vectors according to the invention are retroviral vectors, in particular
lentiviral vectors such as HIV and EIAV vectors. The retroviral vector of the
present invention may be derived from or may be derivable from any suitable
retrovirus. A large number of different retroviruses have been identified.
Examples include: murine leukemia virus (MLV), human immunodeficiency
virus (HIV), simian immunodeficiency virus, human T-cell leukemia virus
(HTLV), equine infectious anaemia virus (EIAV), mouse mammary tumour
virus (MMTV), Rous sarcoma virus (RSV), Fujinami sarcoma virus (FuSV),
Moloney murine leukemia virus (Mo-MLV), FBR murine osteosarcoma virus
(FBR MSV), Moloney murine sarcoma virus (Mo-MSV), Abelson murine
leukemia virus (A-MLV), Avian myelocytomatosis virus-29 (MC29), and Avian
erythroblastosis virus (AEV). A detailed list of retroviruses may be found in
Coffin et al, 1997, "Retroviruses", Cold Spring Harbor Laboratory Press Eds:
JM Coffin, SM Hughes, HE Varmus pp 758-763.

The term "derivable" is used in its normal sense as meaning a nucleotide sequence
such as an LTR or a part thereof which need not necessarily be obtained from a
vector such as a retroviral vector but instead could be derived therefrom. By way of
example, the sequence may be prepared synthetically or by use of recombinant
DNA techniques.

Details on the genomic structure of some retroviruses may be found in the art.
By way of example, details on HIV and Mo-MLV may be found from the NCBI
Genbank (Genome Accession Nos. AF033819 and AF033811, respectively).
Details of HIV variants may also be found at http://hiv-web.lanl.gov . Details of
EIAV variants may be found through http://www.ncbi.nlm.nih.gov .

The lentivirus group can be split even further into "primate" and "non-primate".
Examples of primate lentiviruses include human immunodeficiency virus (HIV),
the causative agent of human acquired immunodeficiency syndrome (AIDS), and
simian immunodeficiency virus (SIV). The non-primate lentiviral group includes
the prototype "slow virus" visna/maedi virus (VMV), as well as the related
caprine arthritis-encephalitis virus (CAEV), equine infectious anaemia virus
(EIAV) and the more recently described feline immunodeficiency virus (FIV)
and bovine immunodeficiency virus (BIV).

The basic structure of a retrovirus genome is a 5' LTR and a 3' LTR, between or
within which are located a packaging signal to enable the genome to be packaged,
a primer binding site, integration sites to enable integration into a host cell
genome and gag, pol and env genes encoding the packaging components - these
are polypeptides required for the assembly of viral particles. More complex
retroviruses have additional features, such as rev and RRE sequences in HIV,
which enable the efficient export of RNA transcripts of the integrated provirus
from the nucleus to the cytoplasm of an infected target cell.

In the provirus, these genes are flanked at both ends by regions called long
terminal repeats (LTRs). The LTRs are responsible for proviral integration, and
transcription. LTRs also serve as enhancer-promoter sequences and can control
the expression of the viral genes. Encapsidation of the retroviral RNAs occurs by
virtue of a psi sequence, which it has been disclosed in respect of HIV, at least, is
located at the 5' end of the viral genome.

The LTRs themselves are identical sequences that can be divided into three
elements, which are called U3, R and U5. U3 is derived from the sequence
unique to the 3' end of the RNA. R is derived from a sequence repeated at both
ends of the RNA and U5 is derived from the sequence unique to the 5' end of the
RNA. The sizes of the three elements can vary considerably among different
retroviruses.
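
The relationship between the element order in the RNA genome and in the
provirus can be pictured with a small data-structure sketch. The Python
function below is illustrative only; its name and the list-of-labels
representation are assumptions. It relies on the well-established outcome of
reverse transcription, in which the RNA order R, U5, ..., U3, R gives a
provirus carrying a complete U3-R-U5 LTR at each end.

```python
def provirus_layout(rna_layout):
    """Convert an RNA genome element order into the proviral element order.

    rna_layout is a list of element labels beginning with R, U5 and ending
    with U3, R. Elements in between (packaging signal, genes, NOIs and so
    on) are carried over unchanged.
    """
    if rna_layout[:2] != ["R", "U5"] or rna_layout[-2:] != ["U3", "R"]:
        raise ValueError("expected an RNA layout of the form R, U5, ..., U3, R")
    ltr = ["U3", "R", "U5"]        # identical LTR present at both ends
    internal = rna_layout[2:-2]    # everything between the terminal elements
    return ltr + internal + ltr


# Hypothetical minimal genome carrying a packaging signal and one NOI.
rna = ["R", "U5", "psi", "NOI", "U3", "R"]
print(" - ".join(provirus_layout(rna)))
```

The sketch makes explicit why U3 and U5, although each unique to one end of
the RNA, both appear in each LTR of the integrated provirus.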

In a defective retroviral vector genome gag, pol and env may be absent or not
functional. The R regions at both ends of the RNA are repeated sequences. U5
and U3 represent unique sequences at the 5' and 3' ends of the RNA genome
respectively.

As discussed above, in a typical retroviral vector for use in gene therapy, at least
part of one or more of the gag, pol and env protein coding regions essential for
replication may be removed from the viral vector. This makes the retroviral
vector replication-defective. The removed portions may even be replaced by a
nucleotide sequence of interest (NOI), as in the present invention, to generate a
virus capable of integrating its genome into a host genome but wherein the
modified viral genome is unable to propagate itself due to a lack of structural
proteins. When integrated in the host genome, expression of the NOI occurs -
resulting in, for example, a therapeutic and/or a diagnostic effect. Thus, the
transfer of an NOI into a site of interest is typically achieved by: integrating the
NOI into the recombinant viral vector; packaging the modified viral vector into a
virion coat; and allowing transduction of a site of interest - such as a targeted cell
or a targeted cell population.

A minimal retroviral genome for use in the present invention may therefore
comprise (5') R - U5 - one or more NOIs - U3 - R (3'). However, the plasmid
vector used to produce the retroviral genome within a host cell/packaging cell
will also include transcriptional regulatory control sequences operably linked to
the retroviral genome to direct transcription of the genome in a host
cell/packaging cell. These regulatory sequences may be the natural sequences
associated with the transcribed retroviral sequence, i.e. the 5' U3 region, or they
may be a heterologous promoter such as another viral promoter, for example the
CMV promoter.
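
By way of illustration only, the element order of the minimal genome described above can be written down as a small data structure. The following Python sketch is not part of the disclosed method; the class name, field names and the string labels are hypothetical and serve only to make the (5') R - U5 - NOI(s) - U3 - R (3') layout explicit, together with the choice of promoter on the producing plasmid.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MinimalVectorGenome:
    """Hypothetical model of the minimal genome layout named in the text."""

    nois: List[str]
    # Promoter on the producer plasmid that drives transcription of the
    # genome: the natural 5' U3 region, or a heterologous one such as CMV.
    plasmid_promoter: str = "5' U3"

    def rna_layout(self) -> List[str]:
        """Return the ordered elements of the RNA genome, 5' to 3'."""
        if not self.nois:
            raise ValueError("at least one NOI is required")
        return ["R", "U5", *self.nois, "U3", "R"]

    def describe(self) -> str:
        """Render the layout in the notation used in the text."""
        return "(5') " + " - ".join(self.rna_layout()) + " (3')"


if __name__ == "__main__":
    genome = MinimalVectorGenome(nois=["NOI-1", "NOI-2"], plasmid_promoter="CMV")
    print(genome.plasmid_promoter, "->", genome.describe())
```

The sketch models only the order of elements; it says nothing about sequence content or about any additional cis-acting sequences a particular retrovirus may need.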

Some retroviral genomes require additional sequences for efficient virus
production. For example, in the case of HIV, rev and RRE sequences should be
included. However, we have found that the requirement for rev and RRE can be
reduced or eliminated by codon optimisation. As expression of the codon
optimised gag-pol is REV independent, RRE can be removed from the gag-pol
expression cassette, thus removing any potential for recombination with any RRE
contained on the vector genome.

Once the retroviral vector genome has integrated into the genome of the target
cell, the NOI sequences need to be expressed. In a retrovirus,
the promoter is located in the 5' LTR U3 region of the provirus. In retroviral
vectors, the promoter driving expression of a therapeutic gene may be the native
retroviral promoter in the 5' U3 region, or an alternative promoter engineered
into the vector. The alternative promoter may physically replace the 5' U3
promoter native to the retrovirus, or it may be incorporated at a different place
within the vector genome such as between the LTRs.

Thus, the NOI will also be operably linked to a transcriptional regulatory control
sequence to allow transcription of the first nucleotide sequence to occur in the
target cell. The control sequence will typically be active in mammalian cells. The
control sequence may, for example, be a viral promoter such as the natural viral
promoter or a CMV promoter or it may be a mammalian promoter. It is
particularly preferred to use a promoter that is preferentially active in a particular
cell type or tissue type that the virus to be treated primarily infects. Thus, in
one embodiment, a tissue-specific regulatory sequence may be used. The
regulatory control sequences driving expression of the one or more first
nucleotide sequences may be constitutive or regulated promoters.

The term "operably linked" denotes a relationship between a regulatory region
(typically a promoter element, but may include an enhancer element) and the
coding region of a gene, whereby the transcription of the coding region is under
the control of the regulatory region.

As used herein, the term "enhancer" includes a DNA sequence which binds other
protein components of the transcription initiation complex and thus facilitates the
initiation of transcription directed by its associated promoter.

In one preferred embodiment of the present invention, the enhancer is an
ischaemic like response element (ILRE).

The term "ischaemia like response element" - otherwise written as ILRE -
includes an element that is responsive to or is active under conditions of
ischaemia or conditions that are like ischaemia or are caused by ischaemia. By
way of example, conditions that are like ischaemia or are caused by ischaemia
include hypoxia and/or low glucose concentration(s).

The term "hypoxia" means a condition under which a particular organ or tissue
receives an inadequate supply of oxygen.

Ischaemia can be an insufficient supply of blood to a specific organ or tissue. A
consequence of decreased blood supply is an inadequate supply of oxygen to the
organ or tissue (hypoxia). Prolonged hypoxia may result in injury to the affected
organ or tissue.

A preferred ILRE is a hypoxia response element (HRE).

In one preferred aspect of the present invention, there is hypoxia or ischaemia
regulatable expression of the retroviral vector components. In this regard,
hypoxia is a powerful regulator of gene expression in a wide range of different
cell types and acts by the induction of the activity of hypoxia-inducible
transcription factors such as hypoxia inducible factor-1 (HIF-1; 6), which bind to
cognate DNA recognition sites, the hypoxia-responsive elements (HREs) on
various gene promoters. Dachs et al (7) have used a multimeric form of the HRE
from the mouse phosphoglycerate kinase-1 (PGK-1) gene (8) to control
expression of both marker and therapeutic genes by human fibrosarcoma cells in
response to hypoxia in vitro and within solid tumours in vivo (7 ibid).

Hypoxia response enhancer elements (HREEs) have also been found in
association with a number of genes including the erythropoietin (EPO) gene (9;
10). Other HREEs have been isolated from regulatory regions of the muscle
glycolytic enzyme pyruvate kinase (PKM) gene (11), the human muscle-specific
β-enolase gene (ENO3; 12) and the endothelin-1 (ET-1) gene (13).

Preferably the HRE of the present invention is selected from, for example, the
erythropoietin HRE element (HREE1), muscle pyruvate kinase (PKM) HRE
element, phosphoglycerate kinase (PGK) HRE, β-enolase (enolase 3; ENO3)
HRE element, endothelin-1 (ET-1) HRE element and metallothionein II (MTII)
HRE element.

Preferably the ILRE is used in combination with a transcriptional regulatory
element, such as a promoter, which transcriptional regulatory element is
preferably active in one or more selected cell type(s), preferably being only active
in one cell type.

As outlined above, this combination aspect of the present invention is called a
responsive element.

Preferably the responsive element comprises at least the ILRE as herein defined. 

Non-limiting examples of such a responsive element are presented as OBHRE1
and XiaMac. Another non-limiting example includes the ILRE in use in
conjunction with an MLV promoter and/or a tissue restricted ischaemic
responsive promoter. These responsive elements are disclosed in WO99/15684.

Other examples of suitable tissue restricted promoters/enhancers are those which
are highly active in tumour cells such as a promoter/enhancer from a MUC1
gene, a CEA gene or a 5T4 antigen gene. The alpha-fetoprotein (AFP) promoter
is also a tumour-specific promoter. One preferred promoter-enhancer
combination is a human cytomegalovirus (hCMV) major immediate early (MIE)
promoter/enhancer combination.

The term "promoter" is used in the normal sense of the art, e.g. an RNA 
polymerase binding site. 

The promoter may be located in the retroviral 5' LTR to control the expression of 
a cDNA encoding an NOI, and/or gag-pol proteins. 

Preferably the NOI and/or gag-pol proteins are capable of being expressed from
the retrovirus genome such as from endogenous retroviral promoters in the long
terminal repeat (LTR).

Preferably the NOI and/or gag-pol proteins are expressed from a heterologous
promoter to which the heterologous gene or sequence, and/or codon optimised
gag-pol sequence is operably linked.

Alternatively, the promoter may be an internal promoter.

Preferably the NOI is expressed from an internal promoter.

Vectors containing internal promoters have also been widely used to express
multiple genes. An internal promoter makes it possible to exploit
promoter/enhancer combinations other than those found in the viral LTR for
driving gene expression. Multiple internal promoters can be included in a
retroviral vector and it has proved possible to express at least three different
cDNAs each from its own promoter (14). Internal ribosomal entry site (IRES)
elements have also been used to allow translation of multiple coding regions from
either a single mRNA or from fusion proteins that can then be expressed from an
open reading frame.

The promoter of the present invention may be constitutively efficient, or may be
tissue or temporally restricted in its activity.

Preferably the promoter is a constitutive promoter such as CMV. 

Preferably the promoters of the present invention are tissue specific. That is, they 
are capable of driving transcription of a NOI or NOI(s) in one tissue while 
remaining largely "silent" in other tissue types. 

The term "tissue specific" means a promoter which is not restricted in activity to
a single tissue type but which nevertheless shows selectivity in that it may be
active in one group of tissues and less active or silent in another group.

The level of expression of an NOI or NOIs under the control of a particular
promoter may be modulated by manipulating the promoter region. For example,
different domains within a promoter region may possess different gene regulatory
activities. The roles of these different regions are typically assessed using vector
constructs having different variants of the promoter with specific regions deleted
(that is, deletion analysis). This approach may be used to identify, for example,
the smallest region capable of conferring tissue specificity or the smallest region
conferring hypoxia sensitivity.

A number of tissue specific promoters, described above, may be particularly
advantageous in practising the present invention. In most instances, these
promoters may be isolated as convenient restriction digestion fragments suitable
for cloning in a selected vector. Alternatively, promoter fragments may be
isolated using the polymerase chain reaction. Cloning of the amplified fragments
may be facilitated by incorporating restriction sites at the 5' end of the primers.
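
The idea of incorporating restriction sites at the 5' end of the primers may be sketched as follows. This Python example is illustrative only and is not taken from the specification: the function names, the annealing length, the two-base 5' "clamp" and the toy template are all assumptions. The recognition sequences shown for EcoRI, BamHI and XhoI are the standard ones.

```python
# Standard recognition sequences for three commonly used enzymes.
RESTRICTION_SITES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "XhoI": "CTCGAG",
}

_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq.upper()))


def design_cloning_primers(template: str, anneal_len: int,
                           fwd_enzyme: str, rev_enzyme: str,
                           clamp: str = "GC"):
    """Build a primer pair carrying restriction sites at the 5' ends.

    Each primer is: clamp + restriction site + template-annealing region.
    The clamp (a few extra 5' bases) is an assumption of this sketch.
    """
    if anneal_len > len(template):
        raise ValueError("annealing region longer than template")
    template = template.upper()
    forward = clamp + RESTRICTION_SITES[fwd_enzyme] + template[:anneal_len]
    reverse = (clamp + RESTRICTION_SITES[rev_enzyme]
               + reverse_complement(template[-anneal_len:]))
    return forward, reverse


if __name__ == "__main__":
    # Toy promoter fragment, invented for the example.
    fragment = "ATGGCCAAGTTGACCAGTGCCGTTCCGGTGCTCACC"
    fwd, rev = design_cloning_primers(fragment, 10, "EcoRI", "BamHI")
    print("forward:", fwd)
    print("reverse:", rev)
```

In practice the chosen sites should be absent from the fragment itself and compatible with the cloning site of the selected vector; the sketch does not check for either.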

The NOI or NOIs may be under the expression control of an expression
regulatory element, such as a promoter and enhancer.

Preferably the ischaemic responsive promoter is a tissue restricted ischaemic 
responsive promoter. 

Preferably the tissue restricted ischaemic responsive promoter is a macrophage 
specific promoter restricted by repression. 

Preferably the tissue restricted ischaemic responsive promoter is an endothelium
specific promoter.

Preferably the regulated retroviral vector of the present invention is an ILRE
regulated retroviral vector.

Preferably the regulated retroviral vector of the present invention is an ILRE
regulated lentiviral vector.

Preferably the regulated retroviral vector of the present invention is an
autoregulated hypoxia responsive lentiviral vector.

Preferably the regulated retroviral vector of the present invention is regulated by 
glucose concentration. 

For example, the glucose-regulated proteins (grp's) such as grp78 and grp94 are
highly conserved proteins known to be induced by glucose deprivation (15). The
grp78 gene is expressed at low levels in most normal healthy tissues under the
influence of basal level promoter elements but has at least two critical "stress
inducible regulatory elements" upstream of the TATA element (15 ibid; 16).

Attachment to a truncated 632 base pair sequence of the 5' end of the grp78
promoter confers high inducibility to glucose deprivation on reporter genes in
vitro (16 ibid). Furthermore, this promoter sequence in retroviral vectors was
capable of driving a high level expression of a reporter gene in tumour cells in
murine fibrosarcomas, particularly in central relatively ischaemic/fibrotic sites
(16 ibid).

Preferably the regulated retroviral vector of the present invention is a self- 
inactivating (SIN) vector. 

By way of example, self-inactivating retroviral vectors have been constructed by
deleting the transcriptional enhancers or the enhancers and promoter in the U3
region of the 3' LTR. After a round of vector reverse transcription and
integration, these changes are copied into both the 5' and the 3' LTRs producing a
transcriptionally inactive provirus (17; 18; 19; 20). However, any promoter(s)
internal to the LTRs in such vectors will still be transcriptionally active. This
strategy has been employed to eliminate effects of the enhancers and promoters
in the viral LTRs on transcription from internally placed genes. Such effects
include increased transcription (21) or suppression of transcription (22). This
strategy can also be used to eliminate downstream transcription from the 3' LTR
into genomic DNA (23). This is of particular concern in human gene therapy
where it is of critical importance to prevent the adventitious activation of an
endogenous oncogene.
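
The way a deletion made in the 3' U3 region ends up in both LTRs of the provirus can be illustrated with a toy model. The Python sketch below is a deliberately simplified assumption, not a description of any actual construct: the class and function names are hypothetical, a U3 region is reduced to two flags, and an LTR is treated as transcriptionally active only when both its enhancer and its promoter are intact.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class U3:
    """Toy U3 region: only tracks whether enhancer/promoter are intact."""
    enhancer: bool = True
    promoter: bool = True


@dataclass(frozen=True)
class VectorRNA:
    """Vector genome RNA; U3 is present only at the 3' end of the RNA."""
    u3_3prime: U3
    internal_promoters: Tuple[str, ...] = ()


@dataclass(frozen=True)
class Provirus:
    ltr5_u3: U3
    ltr3_u3: U3
    internal_promoters: Tuple[str, ...]

    def ltr_active(self) -> bool:
        # Simplification: the 5' LTR drives transcription only if both
        # its enhancer and its promoter are intact.
        return self.ltr5_u3.enhancer and self.ltr5_u3.promoter


def make_sin(rna: VectorRNA, delete_promoter: bool = False) -> VectorRNA:
    """Delete the enhancers (and optionally the promoter) from the 3' U3."""
    new_u3 = replace(rna.u3_3prime, enhancer=False,
                     promoter=rna.u3_3prime.promoter and not delete_promoter)
    return replace(rna, u3_3prime=new_u3)


def reverse_transcribe_and_integrate(rna: VectorRNA) -> Provirus:
    """Both proviral LTRs receive their U3 from the 3' U3 of the RNA."""
    return Provirus(rna.u3_3prime, rna.u3_3prime, rna.internal_promoters)


if __name__ == "__main__":
    vector = VectorRNA(U3(), internal_promoters=("CMV",))
    sin_provirus = reverse_transcribe_and_integrate(make_sin(vector))
    print("LTR active:", sin_provirus.ltr_active())
    print("internal promoters retained:", sin_provirus.internal_promoters)
```

The internal promoter list is carried through unchanged, mirroring the statement above that promoters internal to the LTRs remain transcriptionally active.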

As discussed above, replication-defective retroviral vectors are typically
propagated, for example to prepare suitable titres of the retroviral vector for
subsequent transduction, by using a combination of a packaging or helper cell
line and the recombinant vector. That is to say, that the three packaging proteins
can be provided in trans.

In general a "packaging cell line" contains one or more of the retroviral gag, pol
and env genes. In the present invention it contains codon optimised gag-pol
genes, and optionally an env gene. The packaging cell line produces the proteins
required for packaging retroviral DNA but it cannot bring about encapsidation.
Conventionally this has been achieved through lack of a psi region. However,
when a recombinant vector carrying an NOI and a psi region is introduced into
the packaging cell line, the helper proteins can package the psi-positive
recombinant vector to produce the recombinant virus stock. This virus stock can
be used to transduce cells to introduce the NOI into the genome of the target
cells. Conventionally a psi packaging signal, called psi plus, has been used that
contains additional sequences spanning from upstream of the splice donor to
downstream of the gag start codon (24) since this has been shown to increase
viral titres.

The recombinant virus whose genome lacks all genes required to make viral
proteins can transduce only once and cannot propagate. These viral vectors which
are only capable of a single round of transduction of target cells are known as
replication defective vectors. Hence, the NOI is introduced into the host/target
cell genome without the generation of potentially harmful retrovirus. A summary
of the available packaging lines is presented in Coffin et al., 1997 (ibid).

The retroviral packaging cell line is preferably in the form of a transiently
transfected cell line. Transient transfections may advantageously be used to
measure levels of vector production when vectors are being developed. In this
regard, transient transfection avoids the longer time required to generate stable
vector-producing cell lines and may also be used if the vector or retroviral
packaging components are toxic to cells. Components typically used to generate
retroviral vectors include a plasmid encoding the gag-pol proteins, a plasmid
encoding the env protein and a plasmid containing an NOI. Vector production
involves transient transfection of one or more of these components into cells
containing the other required components. If the vector encodes toxic genes or
genes that interfere with the replication of the host cell, such as inhibitors of the
cell cycle or genes that induce apoptosis, it may be difficult to generate stable
vector-producing cell lines, but transient transfection can be used to produce the
vector before the cells die. Also, cell lines have been developed using transient
transfection that produce vector titre levels that are comparable to the levels
obtained from stable vector-producing cell lines (25).

Producer cells/packaging cells can be of any suitable cell type. Producer cells are
generally mammalian cells but can be, for example, insect cells. A producer cell
may be a packaging cell containing the virus structural genes, normally integrated
into its genome, into which the regulated retroviral vectors of the present
invention are introduced. Alternatively the producer cell may be transfected with
nucleic acid sequences encoding structural components, such as codon optimised
gag-pol and env on one or more vectors such as plasmids, adenovirus vectors,
herpes viral vectors or any method known to deliver functional DNA into target
cells. The vectors according to the present invention are then introduced into the
packaging cell by the methods of the present invention.

As used herein, the term "producer cell" or "vector producing cell" refers to a cell
which contains all the elements necessary for production of regulated retroviral
vector particles and regulated retroviral delivery systems.

Preferably, the producer cell is obtainable from a stable producer cell line. 

Preferably, the producer cell is obtainable from a derived stable producer cell
line.

Preferably, the producer cell is obtainable from a derived producer cell line.

As used herein, the term "derived producer cell line" is a transduced producer cell
line which has been screened and selected for high expression of a marker gene.
Such cell lines contain retroviral insertions in integration sites that support high
level expression from the retroviral genome. The term "derived producer cell
line" is used interchangeably with the term "derived stable producer cell line" and
the term "stable producer cell line".

Preferably the derived producer cell line includes but is not limited to a retroviral 
and/or a lentiviral producer cell. 

Preferably the derived producer cell line is an HIV or EIAV producer cell line,
more preferably an EIAV producer cell line.

Preferably the envelope protein sequences, and nucleocapsid sequences are all
stably integrated in the producer and/or packaging cell. However, one or more of
these sequences could also exist in episomal form and gene expression could
occur from the episome.

As used herein, the term "packaging cell" refers to a cell which contains those
elements necessary for production of infectious recombinant virus which are
lacking in a recombinant viral vector. Typically, such packaging cells contain
one or more vectors which are capable of expressing viral structural proteins
(such as codon optimised gag-pol and env) but they do not contain a packaging
signal.

The term "packaging signal", which is referred to interchangeably as "packaging
sequence" or "psi", is used in reference to the non-coding, cis-acting sequence
required for encapsidation of retroviral RNA strands during viral particle
formation. In HIV-1, this sequence has been mapped to loci extending from
upstream of the major splice donor site (SD) to at least the gag start codon.

Packaging cell lines suitable for use with the above-described vector constructs
may be readily prepared (see also WO 92/05266), and utilised to create producer
cell lines for the production of retroviral vector particles. As already mentioned, a
summary of the available packaging lines is presented in "Retroviruses" (1997
Cold Spring Harbour Laboratory Press Eds: JM Coffin, SM Hughes, HE Varmus
pp449).

Also as discussed above, simple packaging cell lines, comprising a provirus in
which the packaging signal has been deleted, have been found to lead to the rapid
production of undesirable replication competent viruses through recombination.
In order to improve safety, second generation cell lines have been produced
wherein the 3' LTR of the provirus is deleted. In such cells, two recombinations
would be necessary to produce a wild type virus. A further improvement involves
the introduction of the gag-pol genes and the env gene on separate constructs,
so-called third generation packaging cell lines. These constructs are introduced
sequentially to prevent recombination during transfection (26; 27).

Preferably, the packaging cell lines are second generation packaging cell lines. 

Preferably, the packaging cell lines are third generation packaging cell lines. 

In these split-construct, third generation cell lines, a further reduction in
recombination may be achieved by "codon wobbling". This technique, based on
the redundancy of the genetic code, aims to reduce homology between the
separate constructs, for example between the regions of overlap in the gag-pol
and env open reading frames.
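
The principle behind codon wobbling, namely that synonymous codons can be substituted to lower nucleotide homology without changing the encoded protein, can be shown with a minimal Python sketch. It is illustrative only: the function names and the toy sequence are assumptions, and the substitution rule (take the alphabetically first alternative synonymous codon) is a simple deterministic choice made for the example rather than the procedure used for any actual construct. The codon table is the standard genetic code.

```python
# Standard genetic code, codons enumerated in T, C, A, G order.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: _AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(_BASES)
    for j, b in enumerate(_BASES)
    for k, c in enumerate(_BASES)
}


def translate(seq: str) -> str:
    """Translate an in-frame DNA sequence ('*' marks a stop codon)."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def wobble(seq: str) -> str:
    """Swap every codon for a different synonymous codon where one exists."""
    if len(seq) % 3:
        raise ValueError("sequence length must be a multiple of three")
    out = []
    for i in range(0, len(seq), 3):
        codon = seq[i:i + 3]
        alternatives = sorted(
            c for c, aa in CODON_TABLE.items()
            if aa == CODON_TABLE[codon] and c != codon
        )
        # Met (ATG) and Trp (TGG) have no synonyms and are left unchanged.
        out.append(alternatives[0] if alternatives else codon)
    return "".join(out)


def identity(a: str, b: str) -> float:
    """Fraction of identical positions between two equal-length sequences."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


if __name__ == "__main__":
    overlap = "ATGGGTGCTCGTTGA"  # toy sequence, invented for the example
    recoded = wobble(overlap)
    print(recoded, translate(recoded) == translate(overlap),
          round(identity(overlap, recoded), 2))
```

A practical recoding would also weigh codon usage in the host cell and the effect on expression; the sketch only shows that the protein sequence is preserved while nucleotide identity falls.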

The packaging cell lines are useful for providing the gene products necessary to
encapsidate and provide a membrane protein for a high titre regulated retrovirus
vector and regulated nucleic gene delivery vehicle production. When regulated
retrovirus sequences are introduced into the packaging cell lines, such sequences
are encapsidated with the nucleocapsid (gag-pol) proteins and these units then
bud through the cell membrane to become surrounded in cell membrane and to
contain the envelope protein produced in the packaging cell line. These
infectious regulated retroviruses are useful as infectious units per se or as gene
delivery vectors.

The packaging cell may be a cell cultured in vitro such as a tissue culture cell
line. Suitable cell lines include but are not limited to mammalian cells such as
murine fibroblast derived cell lines or human cell lines. Preferably the packaging
cell line is a human cell line, such as for example: HEK293, 293-T, TE671,
HT1080.

Alternatively, the packaging cell may be a cell derived from the individual to be
treated such as a monocyte, macrophage, blood cell or fibroblast. The cell may
be isolated from an individual and the packaging and vector components
administered ex vivo followed by re-administration of the autologous packaging
cells.

It is highly desirable to use high-titre virus preparations in both experimental and
practical applications. Techniques for increasing viral titre include using a psi
plus packaging signal as discussed above and concentration of viral stocks. In
addition, the use of different envelope proteins, such as the G protein from
vesicular-stomatitis virus has improved titres following concentration to 10^ per
ml (28). However, typically the envelope protein will be chosen such that the
viral particle will preferentially infect cells that are infected with the virus which
it is desired to treat. For example where an HIV vector is being used to treat HIV
infection, the env protein used will be the HIV env protein.

The process of producing a retroviral vector in which the envelope protein is not
the native envelope of the retrovirus is known as "pseudotyping". Certain
envelope proteins, such as MLV envelope protein and vesicular stomatitis virus
G (VSV-G) protein, pseudotype retroviruses very well. Pseudotyping is not a
new phenomenon and examples may be found in WO-A-98/05759,
WO-A-98/05754, WO-A-97/17457, WO-A-96/09400, WO-A-91/00047 and (29).

As used herein, the term "high titre" means an effective amount of a retroviral
vector or particle which is capable of transducing a target site such as a cell.

As used herein, the term "effective amount" means an amount of a regulated
retroviral or lentiviral vector or vector particle which is sufficient to induce
expression of an NOI at a target site.

Preferably the titre is from at least 10 retrovirus particles per ml, such as from
10° to 10' per ml, more preferably at least 10 retrovirus particles per ml.

In accordance with the present invention, it is possible to manipulate the viral
genome or the regulated retroviral vector nucleotide sequence, so that viral genes
are replaced or supplemented with one or more NOIs which may be heterologous
NOIs.

The term "heterologous" refers to a nucleic acid sequence or protein sequence
linked to a nucleic acid or protein sequence to which it is not naturally linked.

With the present invention, the term NOI (i.e. nucleotide sequence of interest)
includes any suitable nucleotide sequence, which need not necessarily be a
complete naturally occurring DNA sequence. Thus, the DNA sequence can be,
for example, a synthetic DNA sequence, a recombinant DNA sequence (i.e.
prepared by use of recombinant DNA techniques), a cDNA sequence or a partial
genomic DNA sequence, including combinations thereof. The DNA sequence
need not be a coding region. If it is a coding region, it need not be an entire
coding region. In addition, the DNA sequence can be in a sense orientation or in
an anti-sense orientation. Preferably, it is in a sense orientation. Preferably, the
DNA is or comprises cDNA.

The NOI(s) may be any one or more of selection gene(s), marker gene(s) and 
therapeutic gene(s). 

As used herein, the term "selection gene" refers to the use of a NOI which 
encodes a selectable marker which may have an enzymatic activity that confers 
resistance to an antibiotic or drug upon the cell in which the selectable marker is 
expressed. 

Many different selectable markers have been used successfully in retroviral
vectors. These are reviewed in "Retroviruses" (1997 Cold Spring Harbour
Laboratory Press Eds: JM Coffin, SM Hughes, HE Varmus pp 444) and include,
but are not limited to, the bacterial neomycin (neo) and hygromycin
phosphotransferase genes which confer resistance to G418 and hygromycin
respectively; a mutant mouse dihydrofolate reductase gene which confers
resistance to methotrexate; the bacterial gpt gene which allows cells to grow in
medium containing mycophenolic acid, xanthine and aminopterin; the bacterial
hisD gene which allows cells to grow in medium without histidine but containing
histidinol; the multidrug resistance gene (mdr) which confers resistance to a
variety of drugs; and the bacterial genes which confer resistance to puromycin or
phleomycin. All of these markers are dominant selectable and allow chemical
selection of most cells expressing these genes. Other selectable markers are not
dominant in that their use must be in conjunction with a cell line that lacks the
relevant enzyme activity. Examples of non-dominant selectable markers include
the thymidine kinase (tk) gene which is used in conjunction with tk- cell lines.

Particularly preferred markers are blasticidin and neomycin, optionally operably
linked to a thymidine kinase coding sequence typically under the transcriptional
control of a strong viral promoter such as the SV40 promoter.

In accordance with the present invention, suitable NOI sequences include those
that are of therapeutic and/or diagnostic application such as, but not limited
to: sequences encoding cytokines, chemokines, hormones, antibodies, engineered
immunoglobulin-like molecules, a single chain antibody, fusion proteins,
enzymes, immune co-stimulatory molecules, immunomodulatory molecules,
anti-sense RNA, a transdominant negative mutant of a target protein, a toxin, a
conditional toxin, an antigen, a tumour suppressor protein and growth factors,
membrane proteins, vasoactive proteins and peptides, anti-viral proteins and
ribozymes, and derivatives thereof (such as with an associated reporter group).
When included, such coding sequences may be typically operatively linked to a
suitable promoter, which may be a promoter driving expression of a ribozyme(s),
or a different promoter or promoters, such as in one or more specific cell types.

Suitable NOIs for use in the invention in the treatment or prophylaxis of cancer
include NOIs encoding proteins which: destroy the target cell (for example a
ribosomal toxin); act as tumour suppressors (such as wild-type p53), activators
of anti-tumour immune mechanisms (such as cytokines, co-stimulatory molecules
and immunoglobulins) or inhibitors of angiogenesis; provide enhanced
drug sensitivity (such as pro-drug activation enzymes); indirectly stimulate
destruction of the target cell by natural effector cells (for example, a strong
antigen to stimulate the immune system); or convert a precursor substance to a
toxic substance which destroys the target cell (for example a prodrug activating
enzyme).

Examples of prodrugs include but are not limited to etoposide phosphate (used
with alkaline phosphatase); 5-fluorocytosine (with cytosine deaminase);
Doxorubicin-N-p-hydroxyphenoxyacetamide (with Penicillin-V-Amidase);
Para-N-bis(2-chloroethyl)aminobenzoyl glutamate (with Carboxypeptidase G2);
Cephalosporin nitrogen mustard carbamates (with β-lactamase); SR4233 (with
p450 reductase); Ganciclovir (with HSV thymidine kinase); mustard pro-drugs
with nitroreductase and cyclophosphamide or ifosfamide (with cytochrome
p450).

Suitable NOIs for use in the treatment or prevention of ischaemic heart disease
include NOIs encoding plasminogen activators. Suitable NOIs for the treatment
or prevention of rheumatoid arthritis or cerebral malaria include genes encoding
anti-inflammatory proteins, antibodies directed against tumour necrosis factor
(TNF) alpha, and anti-adhesion molecules (such as antibody molecules or
receptors specific for adhesion molecules).

The expression products encoded by the NOIs may be proteins which are
secreted from the cell. Alternatively the NOI expression products are not
secreted and are active within the cell. In either event, it is preferred for the NOI
expression product to demonstrate a bystander effect or a distant bystander effect;
that is the production of the expression product in one cell leading to the killing
of additional, related cells, either neighbouring or distant (e.g. metastatic), which
possess a common phenotype. Encoded proteins could also destroy bystander
tumour cells (for example with secreted antitumour antibody-ribosomal toxin
fusion protein), indirectly stimulate destruction of bystander tumour cells (for
example cytokines to stimulate the immune system or procoagulant proteins
causing local vascular occlusion) or convert a precursor substance to a toxic
substance which destroys bystander tumour cells (e.g. an enzyme which activates a
prodrug to a diffusible drug). Also envisaged is the delivery of NOI(s) encoding
antisense transcripts or ribozymes which interfere with expression of cellular
genes for tumour persistence (for example against aberrant myc transcripts in
Burkitt's lymphoma or against bcr-abl transcripts in chronic myeloid leukemia).
The use of combinations of such NOIs is also envisaged.

The NOI or NOIs of the present invention may also comprise one or more
cytokine-encoding NOIs. Suitable cytokines and growth factors include but are
not limited to: ApoE, Apo-SAA, BDNF, Cardiotrophin-1, EGF, ENA-78,
Eotaxin, Eotaxin-2, Exodus-2, FGF-acidic, FGF-basic, fibroblast growth
factor-10 (30), FLT3 ligand, Fractalkine (CX3C), GDNF, G-CSF, GM-CSF, GF-β1,
insulin, IFN-γ, IGF-I, IGF-II, IL-1α, IL-1β, IL-2, IL-3, IL-4, IL-5, IL-6, IL-7,
IL-8 (72 a.a.), IL-8 (77 a.a.), IL-9, IL-10, IL-11, IL-12, IL-13, IL-15, IL-16, IL-17,
IL-18 (IGIF), Inhibin α, Inhibin β, IP-10, keratinocyte growth factor-2 (KGF-2),
KGF, Leptin, LIF, Lymphotactin, Mullerian inhibitory substance, monocyte
colony inhibitory factor, monocyte attractant protein (30 ibid), M-CSF, MDC (67
a.a.), MDC (69 a.a.), MCP-1 (MCAF), MCP-2, MCP-3, MCP-4, MIG, MIP-1α,
MIP-1β, MIP-3α, MIP-3β, MIP-4, myeloid
progenitor inhibitor factor-1 (MPIF-1), NAP-2, Neurturin, Nerve growth factor,
β-NGF, NT-3, NT-4, Oncostatin M, PDGF-AA, PDGF-AB, PDGF-BB, PF-4,
RANTES, SDF1α, SDF1β, SCF, SCGF, stem cell factor (SCF), TARC, TGF-α,
TGF-β, TGF-β2, TGF-β3, tumour necrosis factor (TNF), TNF-α, TNF-β,
TNIL-1, TPO, VEGF, GCP-2, GRO/MGSA, GRO-β, GRO-γ, HCC1, I-309.

The NOI or NOIs may be under the expression control of an expression
regulatory element, such as a promoter and/or a promoter enhancer, also known as
"responsive elements" in the present invention.

When the regulated retroviral vector particles are used to transfer NOIs into cells
which they transduce, such vector particles are also designated "viral delivery
systems" or "retroviral delivery systems". Viral vectors, including retroviral
vectors, have been used to transfer NOIs efficiently by exploiting the viral
transduction process. NOIs cloned into the retroviral genome can be delivered
efficiently to cells susceptible to transduction by a retrovirus. Through other
genetic manipulations, the replicative capacity of the retroviral genome can be
destroyed. The vectors introduce new genetic material into a cell but are unable
to replicate.

The regulated retroviral vector of the present invention can be delivered by viral
or non-viral techniques. Non-viral delivery systems include but are not limited to
DNA transfection methods. Here, transfection includes a process using a non-
viral vector to deliver a gene to a target mammalian cell.

Typical transfection methods include electroporation, DNA biolistics, lipid-
mediated transfection, compacted DNA-mediated transfection, liposomes,
immunoliposomes, lipofectin, cationic agent-mediated, cationic facial
amphiphiles (CFAs) (31), multivalent cations such as spermine, cationic lipids or
polylysine, 1, 2,-bis (oleoyloxy)-3-(trimethylammonio) propane (DOTAP)-
cholesterol complexes (32) and combinations thereof.

Viral delivery systems include but are not limited to an adenovirus vector, an
adeno-associated viral (AAV) vector, a herpes viral vector, a retroviral vector, a
lentiviral vector, or a baculoviral vector. These viral delivery systems may be
configured as a split-intron vector. A split-intron vector is described in WO
99/15683.

Other examples of vectors include ex vivo delivery systems, which include but
are not limited to DNA transfection methods such as electroporation, DNA
biolistics, lipid-mediated transfection and compacted DNA-mediated transfection.

The vector may be a plasmid DNA vector. Alternatively, the vector may be a
recombinant viral vector. Suitable recombinant viral vectors include adenovirus
vectors, adeno-associated viral (AAV) vectors, herpes-virus vectors, retroviral
vectors, lentiviral vectors or a combination of adenoviral and lentiviral vectors.
In the case of viral vectors, gene delivery is mediated by viral infection of a target
cell.

If the features of adenoviruses are combined with the genetic stability of 
retro/lentiviruses then essentially the adenovirus can be used to transduce target 
cells to become transient retroviral producer cells that could stably infect 
neighbouring cells. 

The present invention also provides a pharmaceutical composition for treating an
individual by gene therapy, wherein the composition comprises a therapeutically
effective amount of a regulated retroviral vector according to the present
invention. The pharmaceutical composition may be for human or animal usage.
Typically, a physician will determine the actual dosage which will be most
suitable for an individual subject and it will vary with the age, weight and
response of the particular patient.

The composition may optionally comprise a pharmaceutically acceptable carrier,
diluent, excipient or adjuvant. The choice of pharmaceutical carrier, excipient or
diluent can be selected with regard to the intended route of administration and
standard pharmaceutical practice. The pharmaceutical compositions may
comprise as - or in addition to - the carrier, excipient or diluent any suitable
binder(s), lubricant(s), suspending agent(s), coating agent(s), solubilising
agent(s), and other carrier agents that may aid or increase the viral entry into the
target site (such as for example a lipid delivery system).

Where appropriate, the pharmaceutical compositions can be administered by any
one or more of: minipumps, inhalation, in the form of a suppository or pessary,
topically in the form of a lotion, solution, cream, ointment or dusting powder, by
use of a skin patch, orally in the form of tablets containing excipients such as
starch or lactose, or in capsules or ovules either alone or in admixture with
excipients, or in the form of elixirs, solutions or suspensions containing
flavouring or colouring agents, or they can be injected parenterally, for example
intracavernosally, intravenously, intramuscularly or subcutaneously. For
parenteral administration, the compositions may be best used in the form of a
sterile aqueous solution which may contain other substances, for example enough
salts or monosaccharides to make the solution isotonic with blood. For buccal or
sublingual administration the compositions may be administered in the form of
tablets or lozenges which can be formulated in a conventional manner.

The present invention is believed to have a wide therapeutic applicability -
depending on inter alia the selection of the one or more NOIs.

For example, the present invention may be useful in the treatment of the disorders
listed in WO-A-98/05635. For ease of reference, part of that list is now provided:
cancer, inflammation or inflammatory disease, dermatological disorders, fever,
cardiovascular effects, haemorrhage, coagulation and acute phase response,
cachexia, anorexia, acute infection, HIV infection, shock states, graft-versus-host
reactions, autoimmune disease, reperfusion injury, meningitis, migraine and
aspirin-dependent anti-thrombosis; tumour growth, invasion and spread,
angiogenesis, metastases, malignant, ascites and malignant pleural effusion;
cerebral ischaemia, ischaemic heart disease, osteoarthritis, rheumatoid arthritis,
osteoporosis, asthma, multiple sclerosis, neurodegeneration, Alzheimer's disease,
atherosclerosis, stroke, vasculitis, Crohn's disease and ulcerative colitis;
periodontitis, gingivitis; psoriasis, atopic dermatitis, chronic ulcers, epidermolysis
bullosa; corneal ulceration, retinopathy and surgical wound healing; rhinitis,
allergic conjunctivitis, eczema, anaphylaxis; restenosis, congestive heart failure,
endometriosis, atherosclerosis or endosclerosis.

In addition, or in the alternative, the present invention may be useful in the
treatment of disorders listed in WO-A-98/07859. For ease of reference, part of
that list is now provided: cytokine and cell proliferation/differentiation activity;
immunosuppressant or immunostimulant activity (e.g. for treating immune
deficiency, including infection with human immune deficiency virus; regulation
of lymphocyte growth; treating cancer and many autoimmune diseases, and to
prevent transplant rejection or induce tumour immunity); regulation of
haematopoiesis, e.g. treatment of myeloid or lymphoid diseases; promoting
growth of bone, cartilage, tendon, ligament and nerve tissue, e.g. for healing
wounds, treatment of burns, ulcers and periodontal disease and
neurodegeneration; inhibition or activation of follicle-stimulating hormone
(modulation of fertility); chemotactic/chemokinetic activity (e.g. for mobilising
specific cell types to sites of injury or infection); haemostatic and thrombolytic
activity (e.g. for treating haemophilia and stroke); antiinflammatory activity (for
treating e.g. septic shock or Crohn's disease); as antimicrobials; modulators of
e.g. metabolism or behaviour; as analgesics; treating specific deficiency
disorders; in treatment of e.g. psoriasis, in human or veterinary medicine.

In addition, or in the alternative, the present invention may be useful in the
treatment of disorders listed in WO-A-98/09985. For ease of reference, part of
that list is now provided: macrophage inhibitory and/or T cell inhibitory activity
and thus, anti-inflammatory activity; anti-immune activity, i.e. inhibitory effects
against a cellular and/or humoral immune response, including a response not
associated with inflammation; inhibit the ability of macrophages and T cells to
adhere to extracellular matrix components and fibronectin, as well as up-
regulated fas receptor expression in T cells; inhibit unwanted immune reaction
and inflammation including arthritis, including rheumatoid arthritis, inflammation
associated with hypersensitivity, allergic reactions, asthma, systemic lupus
erythematosus, collagen diseases and other autoimmune diseases, inflammation
associated with atherosclerosis, arteriosclerosis, atherosclerotic heart disease,
reperfusion injury, cardiac arrest, myocardial infarction, vascular inflammatory
disorders, respiratory distress syndrome or other cardiopulmonary diseases,
inflammation associated with peptic ulcer, ulcerative colitis and other diseases of
the gastrointestinal tract, hepatic fibrosis, liver cirrhosis or other hepatic diseases,
thyroiditis or other glandular diseases, glomerulonephritis or other renal and
urologic diseases, otitis or other oto-rhino-laryngological diseases, dermatitis or
other dermal diseases, periodontal diseases or other dental diseases, orchitis or
epididimo-orchitis, infertility, orchidal trauma or other immune-related testicular
diseases, placental dysfunction, placental insufficiency, habitual abortion,
eclampsia, pre-eclampsia and other immune and/or inflammatory-related
gynaecological diseases, posterior uveitis, intermediate uveitis, anterior uveitis,
conjunctivitis, chorioretinitis, uveoretinitis, optic neuritis, intraocular
inflammation, e.g. retinitis or cystoid macular oedema, sympathetic ophthalmia,
scleritis, retinitis pigmentosa, immune and inflammatory components of
degenerative fundus disease, inflammatory components of ocular trauma, ocular
inflammation caused by infection, proliferative vitreo-retinopathies, acute
ischaemic optic neuropathy, excessive scarring, e.g. following glaucoma filtration
operation, immune and/or inflammation reaction against ocular implants and
other immune and inflammatory-related ophthalmic diseases, inflammation
associated with autoimmune diseases or conditions or disorders where, both in
the central nervous system (CNS) or in any other organ, immune and/or
inflammation suppression would be beneficial, Parkinson's disease, complication
and/or side effects from treatment of Parkinson's disease, AIDS-related dementia
complex, HIV-related encephalopathy, Devic's disease, Sydenham chorea,
Alzheimer's disease and other degenerative diseases, conditions or disorders of
the CNS, inflammatory components of strokes, post-polio syndrome, immune and
inflammatory components of psychiatric disorders, myelitis, encephalitis,
subacute sclerosing pan-encephalitis, encephalomyelitis, acute neuropathy,
subacute neuropathy, chronic neuropathy, Guillain-Barre syndrome, Sydenham
chorea, myasthenia gravis, pseudo-tumour cerebri, Down's Syndrome,
Huntington's disease, amyotrophic lateral sclerosis, inflammatory components of
CNS compression or CNS trauma or infections of the CNS, inflammatory
components of muscular atrophies and dystrophies, and immune and
inflammatory related diseases, conditions or disorders of the central and
peripheral nervous systems, post-traumatic inflammation, septic shock, infectious
diseases, inflammatory complications or side effects of surgery, bone marrow
transplantation or other transplantation complications and/or side effects,
inflammatory and/or immune complications and side effects of gene therapy, e.g.
due to infection with a viral carrier, or inflammation associated with AIDS, to
suppress or inhibit a humoral and/or cellular immune response, to treat or
ameliorate monocyte or leukocyte proliferative diseases, e.g. leukaemia, by
reducing the amount of monocytes or lymphocytes, for the prevention and/or
treatment of graft rejection in cases of transplantation of natural or artificial cells,
tissue and organs such as cornea, bone marrow, organs, lenses, pacemakers,
natural or artificial skin tissue.

The invention will now be further described by way of Examples, which are meant
to serve to assist one of ordinary skill in the art in carrying out the invention and
are not intended in any way to limit the scope of the invention. The Examples refer
to the Figures. In the Figures:

Description of the Figures 

Figure 1 shows schematically how to create a suitable 3' LTR by PCR;

Figure 2 shows the codon usage table for wild type HIV gag-pol of strain HXB2
(accession number: K03455);

Figure 3a shows the codon usage table of the codon optimised sequence designated
gagpol-SYNgp. Figure 3b shows a comparative codon usage table;

Figure 4 shows the codon usage table of the wild type HIV env called env-mn;

Figure 5 shows the codon usage table of the codon optimised sequence of HIV env
designated SYNgp160mn;

Figure 6 shows two plasmid constructs for use in the invention;

Figure 7 shows the principle behind two systems for producing retroviral vector
particles;

Figure 8 shows a sequence comparison between the wild type HIV gag-pol
sequence (pGP-RRE3) and the codon optimised gag-pol sequence (pSYNGP);

Figure 9 shows a sequence comparison between the wild type EIAV gag-pol
sequence (WT) and the codon optimised gag-pol sequence (CO);

Figure 10 shows Rev independence of protein expression particle formation;

Figure 11 shows translation rates of wild-type (WT) and codon optimised gag-pol;

Figure 12 shows gag-pol mRNA levels in total and cytoplasmic fractions;

Figure 13 shows the effect of insertion of WT gag downstream of the codon
optimised gene on RNA and protein levels;

Figure 14 shows the plasmids used to study the effect of HIV-1 gag on the codon
optimised gene;

Figure 15 shows the effect on cytoplasmic RNA of insertion of HIV-1 gag
upstream of the codon optimised gene;

Figure 16 shows the effect of Leptomycin B (LMB) on protein production;

Figure 17 shows the cytoplasmic RNA levels of the vector genomes;

Figure 18 shows transduction efficiency at MOI 1;

Figure 19 shows a schematic representation of pGP-RRE3;

Figure 20 shows a schematic representation of pSYNGP;

Figure 21 shows vector titres generated with different gag-pol constructs;

Figure 22 shows vector titres from the Rev/RRE (-) and (+) genomes;

Figure 23 shows vector titres from the pHS series of vector genomes;

Figure 24 shows vector titres for the pHS series of vector genomes in the presence
or absence of Rev/RRE;

Figure 25 shows an analysis of gag-pol constructs;

Figure 26 shows a Western blot of 293T extracts;

Figure 27 is a schematic representation of pESYNGP;

Figure 28 is a schematic representation of LpESYNGP;

Figure 29 is a schematic representation of LpESYNGPRRE;

Figure 30 is a schematic representation of pESYNGPRRE;

Figure 31 is a schematic representation of pONY4.0Z;

Figure 32 is a schematic representation of pONY8.0Z;

Figure 33 is a schematic representation of pONY8.1Z;

Figure 34 is a schematic representation of pONY3.1;

Figure 35 is a schematic representation of pCIneoERev;

Figure 36 is a schematic representation of pESYNREV;

Figures 37 and 38 show the effect of different vector constructs on viral vector
titres;

Figures 39 and 40 show the effect of different vector constructs on RT activity;

Figure 41 shows the effect of the 5' leader sequence on viral vector titre;

Figure 42 shows viral vector titres when using pONY8.1Z;

Figure 43 shows a comparison between the sequences of pONY3.1 and codon
optimised pONY3.20Fn in the first 372 nucleotides of gag;

Figure 44 is a schematic representation of pIRES1hygESYNGP;

Figures 45 and 46 show the results of experiments to confirm that codon optimised
gag-pol can be used in the production of packaging and producer cell lines;

Figures 47 and 48 show the results of experiments which confirm that RNA from
codon optimised gag-pol is packaged less efficiently than that from the wild-type
gene;

Figure 49 shows the results of an experiment which confirms that expression from
pESYNGP and pESDSYNGP are similar;

Figure 50 is a schematic representation of pESDSYNGP; and

Figure 51 shows the results of an experiment which confirms that the efficiency of
encapsidating gag-pol RNA in PEV-17 cells and B-241 cells is similar.

In more detail, Figure 8 shows a sequence comparison between the wild type HIV
gag-pol sequence (pGP-RRE3) and the codon optimised gag-pol sequence
(pSYNGP), wherein the upper sequence represents pSYNGP and the lower sequence
represents pGP-RRE3.

Figure 10 shows Rev independence of protein expression particle formation. 5 µg of
the gag-pol expression plasmids were transfected into 293T cells in the presence or
absence of Rev (pCMV-Rev), and protein levels were determined 48 hours post
transfection in culture supernatants (A) and cell lysates (B). HIV-1 positive human
serum was used to detect the gag-pol proteins. The blots were re-probed with an
anti-actin antibody, as an internal control (C). The protein marker (New England
Biolabs) sizes (in kDa) are shown on the side of the gel. Lanes: 1. Mock transfected
293T cells, 2. pGP-RRE3, 3. pGP-RRE3 + pCMV-Rev, 4. pSYNGP, 5. pSYNGP +
pCMV-Rev, 6. pSYNGP-RRE, 7. pSYNGP-RRE + pCMV-Rev, 8. pSYNGP-ERR,
9. pSYNGP-ERR + pCMV-Rev.

Figure 11 shows translation rates of WT and codon optimised gag-pol. 293T cells
were transfected with 2 µg pGP-RRE3 (+/- 1 µg pCMV-Rev) or 2 µg pSYNGP.
Protein samples from culture supernatants (A) and cell extracts (B) were analysed
by Western blotting 12, 25, 37 and 48 hours post-transfection. HIV-1 positive
human serum was used to detect gag-pol proteins (A, B) and an anti-actin
antibody was used as an internal control (C). The protein marker sizes are shown
on the side of the gel (in kD). A Phosphorimager was used for quantification of
the results. Lanes: 1. pGP-RRE3 12h, 2. pGP-RRE3 25h, 3. pGP-RRE3 37h, 4.
pGP-RRE3 48h, 5. pGP-RRE3 + pCMV-Rev 12h, 6. pGP-RRE3 + pCMV-Rev
25h, 7. pGP-RRE3 + pCMV-Rev 37h, 8. pGP-RRE3 + pCMV-Rev 48h, 9.
pSYNGP 12h, 10. pSYNGP 25h, 11. pSYNGP 37h, 12. pSYNGP 48h, 13. Mock
transfected 293T cells.

Figure 12 shows gag-pol mRNA levels in total and cytoplasmic fractions. Total
and cytoplasmic RNA was extracted from 293T cells 36 hours after transfection
with 5 µg of the gag-pol expression plasmid (+/- 1 µg pCMV-Rev) and mRNA
levels were estimated by Northern blot analysis. A probe complementary to nt
1222-1503 of both the wild type and codon optimised gene was used. Panel A
shows the band corresponding to the HIV-1 gag-pol. The sizes of the mRNAs are
4.4 kb for the codon optimised and 6 kb for the wild type gene. Panel B shows
the band corresponding to human ubiquitin (internal control for normalisation of
results). Quantification was performed using a Phosphorimager. Lane numbering:
c indicates cytoplasmic fraction and t indicates total RNA fraction. Lanes: 1.
pGP-RRE3, 2. pGP-RRE3 + pCMV-Rev, 3. pSYNGP, 4. pSYNGP + pCMV-
Rev, 5. pSYNGP-RRE, 6. pSYNGP-RRE + pCMV-Rev, 7. Mock transfected
293T cells, 8. pGP-RRE3 + pCMV-Rev, 9. Mock transfected 293T cells, 10.
pSYNGP.

Figure 13 shows the effect of insertion of WT gag downstream of the codon
optimised gene on RNA and protein levels. The wt gag sequence was inserted
downstream of the codon optimised gene in both orientations (NotI site),
resulting in plasmids pSYN6 (correct orientation, see Figure 14) and pSYN7
(reverse orientation, see Figure 14). The gene encoding for β-galactosidase
(LacZ) was also inserted in the same site and the correct orientation (plasmid
pSYN8, see Figure 14). 293T cells were transfected with 5 µg of each plasmid
and 48 hours post transfection mRNA and protein levels were determined as
previously described by means of Northern and Western blot analysis
respectively.

Northern blot analysis in cytoplasmic RNA fractions: The blot was probed with a
probe complementary to nt 1510-2290 of the codon optimised gene (I) and was
re-probed with a probe specific for human ubiquitin (II). Lanes: 1. pSYNGP, 2.
pSYN8, 3. pSYN7, 4. pSYN6.

Western blot analysis: HIV-1 positive human serum was used to detect the gag-
pol proteins (I) and an anti-actin antibody was used as an internal control (II).
Lanes: Cell lysates: 1. Mock transfected 293T cells, 2. pGP-RRE3 + pCMV-Rev,
3. pSYNGP, 4. pSYN6, 5. pSYN7, 6. pSYN8. Supernatants: 7. Mock transfected
293T cells, 8. pGP-RRE3 + pCMV-Rev, 9. pSYNGP, 10. pSYN6, 11. pSYN7,
12. pSYN8. The protein marker (New England Biolabs) sizes are shown on the
side of the gel.

Figure 14 shows the plasmids used to study the effect of HIV-1 gag on the codon
optimised gene. The backbone for all constructs was pCI-Neo. Syn gp: The
codon optimised HIV-1 gag-pol gene. HXB2 gag: The wild type HIV-1 gag
gene. HXB2 gag,r: The wild type HIV-1 gag gene in the reverse orientation.
HXB2 gagΔATG: The wild type HIV-1 gag gene without the gag ATG. HXB2
gag-fr.sh.: The wild type HIV-1 gag gene with a frameshift mutation. HXB2 gag
625-1503: Nucleotides 625-1503 of the wild type HIV-1 gag gene. HXB2 gag 1-
625: Nucleotides 1-625 of the wild type HIV-1 gag gene.

Figure 15 shows the effect on cytoplasmic RNA of insertion of HIV-1 gag
upstream of the codon optimised gene. Cytoplasmic RNA was extracted 48 hours
post transfection of 293T cells (5 µg of each pSYN plasmid was used and 1 µg of
pCMV-Rev was co-transfected in some cases). The probe that was used was
designed to be complementary to nt 1510-2290 of the codon optimised gene (I).
A probe specific for human ubiquitin was used as an internal control (II).
Lanes: 1. pSYNGP, 2. pSYN9, 3. pSYN10, 4. pSYN10 + pCMV-Rev, 5.
pSYN11, 6. pSYN11 + pCMV-Rev, 7. pCMV-Rev.
Lanes: 1. pSYNGP, 2. pSYNGP-RRE, 3. pSYNGP-RRE + pCMV-Rev, 4.
pSYN12, 5. pSYN14, 6. pSYN14 + pCMV-Rev, 7. pSYN13, 8. pSYN15, 9.
pSYN17, 10. pGP-RRE3, 11. pSYN6, 12. pSYN9, 13. pCMV-Rev.

Figure 16 shows the effect of LMB on protein production. 293T cells were
transfected with 1 µg pCMV-Rev and 3 µg of pGP-RRE3/pSYNGP/pSYNGP-
RRE (+/- 1 µg pCMV-Rev). Transfections were done in duplicate. 5 hours post
transfection the medium was replaced with fresh medium in the first set and with
fresh medium containing 7.5 nM LMB in the second. 20 hours later the cells
were lysed and protein production was estimated by Western blot analysis. HIV-1
positive human serum was used to detect the gag-pol proteins (A) and an anti-
actin antibody was used as an internal control (B). Lanes: 1. pGP-RRE3, 2. pGP-
RRE3 + LMB, 3. pGP-RRE3 + pCMV-Rev, 4. pGP-RRE3 + pCMV-Rev +
LMB, 5. pSYNGP, 6. pSYNGP + LMB, 7. pSYNGP + pCMV-Rev, 8. pSYNGP
+ pCMV-Rev + LMB, 9. pSYNGP-RRE, 10. pSYNGP-RRE + LMB, 11.
pSYNGP-RRE + pCMV-Rev, 12. pSYNGP-RRE + pCMV-Rev + LMB.

Figure 17 shows the cytoplasmic RNA levels of the vector genomes. 293T cells
were transfected with 10 µg of each vector genome. Cytoplasmic RNA was
extracted 48 hours post transfection. 20 ng of RNA were used from each sample
for Northern blot analysis. The 700bp probe was designed to hybridise to all
vector genome RNAs (see Materials and Methods). Lanes: 1. pH6nZ, 2. pH6nZ +
pCMV-Rev, 3. pH6.1nZ, 4. pH6.1nZ + pCMV-Rev, 5. pHS1nZ, 6. pHS2nZ, 7.
pHS3nZ, 8. pHS4nZ, 9. pHS5nZ, 10. pHS6nZ, 11. pHS7nZ, 12. pHS8nZ, 13.
pCMV-Rev.

Figure 18 shows transduction efficiency at MOI 1. Viral stocks were generated
by co-transfection of each gag-pol expression plasmid (5 or 0.5 µg), 15 µg
pH6nZ or pHS3nZ (vector genome plasmid) and 5 µg pHCMVG (VSV envelope
expression plasmid) on 293T cells. Virus was concentrated as previously
described (45) and transduction efficiency was determined at m.o.i.'s 0.01-1 on
HT1080 cells. There was a linear correlation of transduction efficiency and m.o.i.
in all cases. An indicative picture at m.o.i. 1 is shown here. Transduction
efficiency was >80% with either genome, either gag-pol and either high or low
amounts of pSYNGP. Titres before concentration (I.U./ml): on 293T cells: A.
6.6x10^ B. 7.6x10^ C. 9.2x10^ D. 1.5x10^ on HT1080 cells: A. 6.0x10^ B.
9.9x10^ C. 8.0x10^ D. 2.9x10^. Titres after concentration (I.U./ml) on HT1080
cells: A. 6.0x10^ B. 2.0x10^ C. 1.4x10^ D. 2.0x10^

Figure 21 shows vector titres obtained with different gag-pol constructs. Viral
stocks were generated by co-transfection of each gag-pol expression plasmid,
pH6nZ (vector genome plasmid) and pHCMVG (VSV envelope expression
plasmid, 2.5 µg for each transfection) on 293T cells. Titres (I.U./ml of virus stock)
were measured on 293T cells by counting the number of blue colonies following
X-Gal staining 48 hours after transduction. Experiments were performed at least
twice and the variation between experiments was less than 15%.
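The colony-counting titre described for Figure 21 reduces to simple arithmetic. The following is a minimal sketch only: the dilution factor and inoculum volume are hypothetical parameters chosen for illustration and are not stated in this document, and the function name is likewise an assumption.

```python
def titre_iu_per_ml(blue_colonies, dilution_factor, inoculum_volume_ml):
    """Estimate infectious units per ml of undiluted virus stock.

    blue_colonies      -- X-Gal positive colonies counted in one well
    dilution_factor    -- fold dilution of the stock applied to that well
                          (hypothetical; not given in the text)
    inoculum_volume_ml -- volume of diluted stock added to the well, in ml
                          (hypothetical; not given in the text)
    """
    if blue_colonies < 0 or dilution_factor <= 0 or inoculum_volume_ml <= 0:
        raise ValueError("counts must be non-negative; dilution and volume positive")
    # Each colony is taken to arise from one infectious unit in the inoculum.
    return blue_colonies * dilution_factor / inoculum_volume_ml


# Illustrative numbers only: 50 colonies from 0.5 ml of a 1:1000 dilution.
example_titre = titre_iu_per_ml(50, 1000, 0.5)
```

Counting is normally done at a dilution where colonies are well separated, since the one-colony-per-infectious-unit assumption breaks down when cells are multiply transduced.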

Figure 22 shows vector titres from the Rev/RRE (-) and (+) genomes. The retroviral
vectors were generated as described in the Examples. Titres (I.U./ml of viral stock ±
SD) were determined in 293T cells.

Figure 23 shows vector titres from the pHS series of vector genomes. The retroviral
vector was generated as described in the Examples. Titres (I.U./ml of viral stock ±
SD) were determined in 293T cells. Rev is provided from pCMV-Rev. Note that
pH6nZ expresses Rev and contains the RRE. None of the other genomes express
Rev or contain the RRE. Expression from pSYNGP is Rev independent, whereas it
is Rev dependent for pGP-RRE3.

Figure 24 shows vector titres for the pHS series of vector genomes in the presence
or absence of Rev/RRE. The retroviral vector was generated as described in the
Examples. 5 µg of vector genome, 5 µg of pSYNGP and 2.5 µg of pHCMVG were
used and titres (I.U./ml) were determined in 293T cells. Experiments were
performed at least twice and the variation between experiments was less than 15%.
Rev is provided from pCMV-Rev (1 µg). Note that pH6nZ expresses Rev and
contains the RRE. None of the pHS genomes expresses Rev and only pHS1nZR,
pHSSnZR, pHS7nZR and pH6.1nZR contain the RRE. gag-pol expression from
pSYNGP is Rev independent.

Figure 26 shows a Western blot of 293T extracts wherein 30 µg of total cellular
protein was separated by SDS/PAGE electrophoresis, transferred to nitrocellulose
and probed with anti-EIAV antibodies. The secondary antibody was anti-Horse
HRP (Sigma).

In Figure 38 the titres are shown in lacZ forming units (L.F.U.)/ml. The vectors
used are indicated in boxes above the bars.

For ease of reference, we also set out the sequences listed in the accompanying
Sequence Listing:

SEQ ID NO:1 shows the sequence of the wild-type gag-pol sequence for the strain
HXB2 (accession no. K03455);

SEQ ID NO:2 shows the sequence of pSYNGP;

SEQ ID NO:3 shows the sequence of the Envelope gene for HIV-1 MN (Genbank
accession no. M17449);

SEQ ID NO:4 shows the sequence of SYNgp-160mn - codon optimised env
sequence;

SEQ ID NO:5 shows the sequence of pESYNGP;

SEQ ID NO:6 shows the sequence of LpESYNGP;

SEQ ID NO:7 shows the sequence of pESYNGPRRE;

SEQ ID NO:8 shows the sequence of LpESYNGPRRE;

SEQ ID NO:9 shows the sequence of pONY4.0Z;

SEQ ID NO:10 shows the sequence of pONY8.0Z;

SEQ ID NO:11 shows the sequence of pONY8.1Z;

SEQ ID NO:12 shows the sequence of pONY3.1;

SEQ ID NO:13 shows the sequence of pCIneoERev;

SEQ ID NO:14 shows the sequence of pESYNREV;

SEQ ID NO:15 shows the sequence of codon optimised HIV gag-pol;

SEQ ID NO:16 shows the sequence of codon optimised EIAV gag-pol;

SEQ ID NO:17 shows the sequence of pIRES1hygESYNGP;

SEQ ID NO:18 shows the sequence of pESDSYNGP; and

SEQ ID NO:19 shows the sequence of pONY8.3GFB29(-).

Example 1 - HIV

Cell lines

293T cells (33) and HeLa cells (34) were maintained in Dulbecco's modified Eagle's
medium containing 10% (v/v) fetal calf serum and supplemented with L-glutamine
and antibiotics (penicillin-streptomycin). 293T cells were obtained from D.
Baltimore (Rockefeller University).

HIV-1 proviral clones

Proviral clones pWI3 (35) and pNL4-3 (36) were used.

Construction of a Packaging System

In one of the present examples, a modified codon optimised HIV env sequence is
used (SEQ I.D. No. 4). The corresponding env expression plasmid is designated
pSYNgp160mn. The modified sequence contains extra motifs not used by (37).
The extra sequences were taken from the HIV env sequence of strain MN and codon
optimised. Any similar modification of the nucleic acid sequence would function
similarly as long as it used codons corresponding to abundant tRNAs (38).

Codon optimised HIV-1 gag-pol gene

A codon optimised gag-pol gene, shown from nt 1108 to 5414 of SEQ ID NO: 2,
was constructed by annealing a series of short overlapping oligonucleotides
(approximately 30-40mers with 25% overlap, i.e. approximately 9 nucleotides).
Oligonucleotides were purchased from R&D SYSTEMS (R&D Systems Europe
Ltd, 4-10 The Quadrant, Barton Lane, Abingdon, OX14 3YS, UK). Codon
optimisation was performed using the sequence of HXB-2 strain (AC: K03455)
(39). The Kozak consensus sequence for optimal translation initiation (40) was also
included. A fragment from base 1222 from the beginning of gag until the end of gag
(1503) was not optimised in order to maintain the frameshift site and the overlap
between the gag and pol reading frames. This was from clone pNL4-3. (When
referring to base numbers within the gag-pol gene, base 1 is the A of the gag ATG,
which corresponds to base 790 from the beginning of the HXB2 sequence. When
referring to sequences outside the gag-pol gene, the numbers refer to bases from the
beginning of the HXB2 sequence, where base 1 corresponds to the beginning of the
5' LTR.) Some deviations from optimisation were made in order to introduce
convenient restriction sites. The final codon usage is shown in Figure 3b; it
now resembles that of highly expressed human genes and is quite different from that
of the wild type HIV-1 gag-pol. The gene was cloned into the mammalian
expression vector pCIneo (Promega) in the EcoRI-NotI sites. The resulting plasmid
was named pSYNGP (Figure 20, SEQ ID No 2). Sequencing of the gene in both
strands verified the absence of any mistakes. A sequence comparison between the
codon optimised and wild type HIV gag-pol sequence is shown in Figure 8.
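
The base-numbering convention and the oligonucleotide layout described above can be illustrated with a short Python sketch. The function names, the fixed 36-nt oligonucleotide length and the toy sequence are assumptions introduced for illustration only; the text states just the offset (gag base 1 = HXB2 base 790) and the approximate oligonucleotide size and overlap. The sketch tiles a single strand, whereas in an annealing-based assembly adjacent oligonucleotides would typically be taken from opposite strands.

```python
# Sketch only: names and defaults below are illustrative assumptions.

GAG_ATG_IN_HXB2 = 790  # from the text: gag base 1 (A of the ATG) = HXB2 base 790


def gag_to_hxb2(gag_base: int) -> int:
    """Convert a 1-based gag-pol base number to HXB2 numbering."""
    if gag_base < 1:
        raise ValueError("gag-pol base numbers start at 1")
    return gag_base + GAG_ATG_IN_HXB2 - 1


def tile_oligos(seq: str, length: int = 36, overlap: int = 9):
    """Split seq into windows of `length` nt sharing `overlap` nt with the next.

    Returns a list of (start, end, subsequence) tuples with 0-based,
    end-exclusive coordinates. The last window may be shorter than `length`.
    """
    if not 0 <= overlap < length:
        raise ValueError("overlap must satisfy 0 <= overlap < length")
    if not seq:
        return []
    step = length - overlap
    oligos = []
    start = 0
    while True:
        end = min(start + length, len(seq))
        oligos.append((start, end, seq[start:end]))
        if end == len(seq):
            break
        start += step
    return oligos


if __name__ == "__main__":
    # The non-optimised fragment (gag bases 1222-1503) in HXB2 numbering.
    print(gag_to_hxb2(1222), gag_to_hxb2(1503))
    # Toy 90-nt sequence, not a real gene fragment.
    toy = "ATGGGCGCCCGCGCCAGC" * 5
    for start, end, oligo in tile_oligos(toy):
        print(start, end, oligo)
```

With a 36-nt window and a 9-nt overlap the step between window starts is 27 nt, which matches the "25% overlap" figure quoted above for oligonucleotides in the 30-40mer range.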

Rev/RRE constructs

The HIV-1 RRE sequence (bases 7769-8021 of the HXB2 sequence) was amplified
by PCR from the pWI3 proviral clone with primers bearing the NotI restriction site and
was subsequently cloned into the NotI site of pSYNGP. The resulting plasmids
were named pSYNGP-RRE (RRE in the correct orientation) and pSYNGP-ERR
(RRE in the reverse orientation).

Pseudotyped viral particles 

In one form of the packaging system a synthetic gag-pol cassette is coexpressed
with a heterologous envelope coding sequence. This could be, for example, VSV-G
(44, 45), amphotropic MLV env (46, 47) or any other protein that would be
incorporated into the HIV or EIAV particle (48). This includes molecules capable
of targeting the vector to specific tissues.

HIV-1 Vector genome constructs

pH6nZ is derived from pH4Z (49) by the addition of a single nucleotide to place
an extra guanine residue that was missing from pH4Z at the 5' end of the vector
genome transcript to optimise reverse transcription. In addition, the gene coding
for β-galactosidase (LacZ) was replaced by a gene encoding a nuclear
localising β-galactosidase. (We are grateful to Enca Martin-Rendon and Said
Ismail for providing pH6nZ.) In order to construct Rev(-) genome constructs the
following modifications were made: a) a 1.8 kb PstI - PstI fragment was
removed from pH6nZ, resulting in plasmid pH6.1nZ, and b) an EcoNI (filled) -
SphI fragment was substituted with a SpeI (filled) - SphI fragment from the same
plasmid (pH6nZ), resulting in plasmid pH6.2nZ. In both cases sequences within
gag (nt 1-625) were retained, as they have been shown to play a role in packaging
(93). Rev, RRE and any other residual env sequences were removed. pH6.2nZ
further contains the env splice acceptor, whereas pH6.1nZ does not.

A series of vectors encompassing further gag deletions plus or minus a mutant
major splice donor (SD) (GT to CA mutation) were also derived from pH6Z.
These were made by PCR with primers bearing a NarI (5' primers) and an SpeI
(3' primers) site. The PCR products were inserted into pH6Z at the NarI - SpeI
sites. The resulting vectors were named pHS1nZ (containing HIV-1 sequences up
to gag 40), pHS2nZ (containing HIV-1 sequences up to gag 260), pHS3nZ
(containing HIV-1 sequences up to gag 360), pHS4nZ (containing HIV-1
sequences up to gag 625), pHS5nZ (same as pHS1nZ but with a mutant SD),
pHS6nZ (same as pHS2nZ but with a mutant SD), pHS7nZ (same as pHS3nZ but
with a mutant SD) and pHS8nZ (same as pHS4nZ but with a mutant SD).

In addition, the RRE sequence (nt 7769-8021 of the HXB2 sequence) was
inserted in the SpeI (filled) site of pH6.1nZ, pHS1nZ, pHS3nZ and pHS7nZ,
resulting in plasmids pH6.1nZR, pHS1nZR, pHS3nZR and pHS7nZR
respectively.

Other modifications to the genome have been made, including the generation of a
SIN vector (by deletion of part of the 3' U3), the replacement of the LTRs with
those from MLV or replacement of part of the 3' U3 with the MLV U3 region.

Transient transfections, transductions and determination of viral titres 

These were performed as previously described (49, 50). Briefly, 293T cells were
seeded on 6 cm dishes and 24 hours later they were transiently transfected by
overnight calcium phosphate treatment. The medium was replaced 12 hours post-
transfection and, unless otherwise stated, supernatants were harvested 48 hours
post-transfection, filtered (through 0.22 or 0.45 µm filters) and titered by
transduction of 293T cells. For this purpose, supernatant at appropriate dilutions of
the original stock was added to 293T cells (plated onto 6 or 12 well plates 24
hours prior to transduction). 8 µg/ml Polybrene (Sigma) was added to each well
and 48 hours post transduction viral titres were determined by X-gal staining.
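
The titre readout described above amounts to scaling the number of X-gal positive cells (or foci) by the dilution and by the volume of supernatant applied. A minimal Python sketch follows, assuming that each positive focus arises from a single transducing particle and that counts are taken from a well in the countable range. The function name, argument names and example numbers are hypothetical and are not taken from the protocol itself.

```python
def titre_per_ml(positive_foci: int, fold_dilution: float, volume_ml: float) -> float:
    """Estimate transducing units per ml of the undiluted stock.

    positive_foci : X-gal positive cells or foci counted in one well
    fold_dilution : e.g. 1000 for a 1:1000 dilution of the original stock
    volume_ml     : volume of diluted supernatant added to that well
    """
    if positive_foci < 0:
        raise ValueError("positive_foci must be non-negative")
    if fold_dilution < 1:
        raise ValueError("fold_dilution must be at least 1 (undiluted)")
    if volume_ml <= 0:
        raise ValueError("volume_ml must be positive")
    return positive_foci * fold_dilution / volume_ml


if __name__ == "__main__":
    # Hypothetical well: 50 blue foci, 1:1000 dilution, 0.5 ml applied.
    print(titre_per_ml(50, 1000, 0.5))
```

Counting from more than one dilution and averaging the estimates is a common way to reduce counting error, although the text does not say whether that was done here.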
Luminescent β-galactosidase (β-gal) assays

These were performed on total cell extracts using a luminescent β-gal reporter
system (CLONTECH). Untransfected 293T cells were used as negative control and
293T cells transfected with pCMV-β-gal (CLONTECH) were used as positive
control.

RNA analysis 

Total or cytoplasmic RNA was extracted from 293T cells by using the RNeasy mini
kit (QIAGEN) 36-48 hours post-transfection. 5-10 µg of RNA was subjected to
Northern blot analysis as previously described (51). Correct fractionation was
verified by staining of the agarose gel. A probe complementary to bases 1222-1503
of the gag-pol gene was amplified by PCR from the HIV-1 pNL4-3 proviral clone and
was used to detect both the codon optimised and wild type gag-pol mRNAs. A
second probe, complementary to nt 1510-2290 of the codon optimised gene, was
also amplified by PCR from plasmid pSYNGP and was used to detect the codon
optimised genes only. A 732 bp fragment complementary to all vector genomes
used in this study was prepared by an SpeI-AvrII digestion of pH6nZ. A probe
specific for ubiquitin (CLONTECH) was used to normalise the results. All probes
were labelled by random labelling (STRATAGENE) with α-32P dCTP (Amersham).
The results were quantitated by using a Storm PhosphorImager (Molecular
Dynamics) and are shown in Figure 12. In the total cellular fractions the 47S rRNA
precursor could be clearly seen, whereas it was absent from the cytoplasmic
fractions. As expected (52), Rev stimulates the cytoplasmic accumulation of wild
type gag-pol mRNA (lanes 1c and 2c). RNA levels were 10-20 fold higher for the
codon optimised gene compared to the wild type one, both in total and cytoplasmic
fractions (compare lanes 3t-2t, 3c-2c, 10c-8c). The RRE sequence did not
significantly destabilise the codon optimised RNAs, since RNA levels were similar
for codon optimised RNAs whether or not they contained the RRE sequence
(compare lanes 3 and 5). Rev did not markedly enhance cytoplasmic accumulation
of the codon optimised gag-pol mRNAs, even when they contained the RRE
sequence (differences in RNA levels were less than 2-fold, compare lanes 3-4 or
5-6).

It appeared from a comparison of Figures 10 and 12 that all of the increase in
protein expression from syngp could be accounted for by the increase in RNA
levels. In order to investigate whether this was due to saturating levels of RNA in
the cell, we transfected 0.1, 1 and 10 µg of the wild type or codon optimised
expression vectors into 293T cells and compared protein production. In all cases
protein production was 10-fold higher for the codon optimised gene for the same
amount of transfected DNA, while the increase in protein levels was proportional to
the amount of transfected DNA for each individual gene. It seems likely therefore that
the enhanced expression of the codon optimised gene can be mainly attributed to the
enhanced RNA levels present in the cytoplasm and not to increased translation.

Protein analysis

Total cell lysates were prepared from 293T cells 48 hours post-transfection
(unless otherwise stated) with an alkaline lysis buffer. For extraction of proteins
from cell supernatants the supernatant was first passed through a 0.22 µm filter
and the vector particles were collected by centrifugation of 1 ml of supernatant at
21,000 g for 30 minutes. Pellets were washed with PBS and then re-suspended in
a small volume (2-10 µl) of lysis buffer. Equal protein amounts were separated on
an SDS 10-12% (v/v) polyacrylamide gel. Proteins were transferred to
nitrocellulose membranes which were probed sequentially with a 1:500 dilution
of HIV-1 positive human serum (AIDS Reagent Project, ADP508, Panel E) and a
1:1000 dilution of horseradish peroxidase labelled anti-human IgG (Sigma,
A0176). Proteins were visualised using the ECL or ECL-plus western blotting
detection reagent (Amersham). To verify equal protein loading, membranes were
stripped and re-probed with a 1:1000 dilution of anti-actin antibody (Sigma,
A2066), followed by a 1:2000 dilution of horseradish peroxidase labelled anti-
rabbit IgG (Vector Laboratories, PI-1000).

Expression of gag-pol gene products and vector particle production 

The wild type gag-pol (pGP-RRE3 - Figure 19) (49), and codon optimised
expression vectors (pSYNGP, pSYNGP-RRE and pSYNGP-ERR) were
transiently transfected into 293T cells. Transfections were performed in the
presence or absence of a Rev expression vector, pCMV-Rev (53), in order to
assess Rev-dependence for expression. Western blot analysis was performed on
cell lysates and supernatants to assess protein production. The results are shown
in Figure 10. As expected (54), expression of the wild type gene is observed only
when Rev is provided in trans (lanes 2 and 3). In contrast, when the codon
optimised gag-pol was used, there was high level expression in both the presence
and absence of Rev (lanes 4 and 5), indicating that in this system there was no
requirement for Rev. Protein levels were higher for the codon optimised gene
than for the wild type gag-pol (compare lanes 4-9 with lane 3). The difference
was more evident in the cell supernatants (approximately 10-fold higher protein
levels for the codon optimised gene compared to the wild type one, quantitated
by using a PhosphorImager) than in the cell lysates.

In previous studies where the RRE has been included in gag-pol expression
vectors that had been engineered to remove INS sequences, inclusion of the RRE
led to a decrease in protein levels that was restored by providing Rev in trans
(55). In our hands, the presence of the RRE in the fully codon optimised gag-pol
mRNA did not affect protein levels and provision of Rev in trans did not further
enhance expression (lanes 6 and 7).

In order to compare translation rates between the wild type and codon optimised
gene, protein production from the wild type and codon optimised expression
vector was determined at several time intervals post transfection into 293T cells.
Protein production and particle formation were determined by Western blot
analysis and the results are shown in Figure 11. Protein production and particle
formation were 10-fold higher for the codon optimised gag-pol at all time points.

To further determine whether the enhanced expression that was observed with
the codon optimised gene was due to better translation or due to effects on the
RNA, RNA analysis was carried out.

The efficiency of vector production using the codon optimised gag-pol gene

To determine the effects of the codon optimised gag-pol on vector production, we
used an HIV vector genome, pH6nZ, and the VSV-G envelope expression
plasmid pHCMVG (113), in combination with either pSYNGP, pSYNGP-RRE,
pSYNGP-ERR or pGP-RRE3 as a source for the gag-pol in a plasmid ratio of 2 :
1 : 2 in a 3 plasmid co-transfection of 293T cells (49). Whole cell extracts and
culture supernatants were evaluated by Western blot analysis for the presence of
the gag and gag-pol gene products. Particle production was, as expected (Figure
10), 5-10 fold higher for the codon optimised genes when compared to the wild
type.

To determine the effects of the codon optimised gag-pol gene on vector titres,
several ratios of the vector components were used. The results are shown in
Figure 21. Where the gag-pol was the limiting component in the system (as
determined by the drop in titres observed with the wild type gene), titres were 10-
fold higher for the codon optimised vectors. This is in agreement with the higher
protein production observed for these vectors, but suggests that under normal
conditions of vector production gag-pol is saturating and the codon optimisation
gives no maximum yield advantage.

The effect of HIV-1 gag INS sequences on the codon optimised gene is
position dependent

It has previously been demonstrated that insertion of wild type HIV-1 gag
sequences downstream of other RNAs, e.g. HIV-1 tat (56), HIV-1 gag (55) or
CAT (57), can lead to a dramatic decrease in steady state mRNA levels,
presumably as a result of the INS sequences. In other cases, e.g. for β-globin
(58), it was shown that the effect was splice site dependent. Cellular AREs (AU-
rich elements) that are found in the 3' UTR of labile mRNAs may confer mRNA
destabilisation by inducing cytoplasmic deadenylation of the transcripts (59). To
test whether HIV-1 gag INS sequences would destabilise the codon optimised
RNA, the wild-type HIV-1 gag sequence, or parts of it (nt 1-625 or nt 625-1503),
were amplified by PCR from the proviral clone pWI3. All fragments were blunt
ended and were inserted into pSYNGP or pSYNGP-RRE at either a blunted
EcoRI or NotI site (upstream or downstream of the codon optimised gag-pol
gene respectively). As controls, the wt HIV-1 gag in the reverse orientation (as
INS sequences have been shown to act in an orientation dependent manner (57))
(pSYN7) and lacZ, excised from plasmid pCMV-βgal (CLONTECH) (in the
correct orientation) (pSYN8), were also inserted in the same site. Contrary to our
expectation, as shown in Figure 13, the wild type HIV-1 gag sequence did not
appear to significantly affect RNA or protein levels of the codon optimised gene.
We further constructed another series of plasmids (by PCR and from the same
plasmids) where the wild type HIV-1 gag in the sense or reverse orientation,
subfragments of gag (nt 1-625 or nt 625-1503), the wild type HIV-1 gag without
the ATG or with a frameshift mutation 25 bases downstream of the ATG, or nt
72-1093 of LacZ (excised from plasmid pH6Z), or the first 1093 bases of lacZ
with or without the ATG were inserted upstream of the codon optimised HIV-1
gag-pol gene in pSYNGP and/or pSYNGP-RRE (pSYN9-pSYN22, Figure 14).
Northern blot analysis showed that insertion of the wild type HIV-1 gag gene
upstream of the codon optimised HIV-1 gag-pol (pSYN9, pSYN10) led to
diminished RNA levels in the presence or absence of Rev/RRE (Figure 15A,
lanes 1-4 and Figure 15B, lanes 1+12). The effect was not dependent on
translation, as insertion of a wild type HIV-1 gag lacking the ATG or with a
frameshift mutation (pSYN12, pSYN13 and pSYN14) also diminished RNA
levels (Figure 15B, lanes 1-7). Western blot analysis verified that there was no
HIV-1 gag translation product for pSYN12-14. However, it is possible that, as the
wt HIV-1 gag exhibits such an adverse codon usage, it may act as a non-
translatable long 5' leader for syngp, and if this is the case, then the ATG
mutation should not have any effects.

Insertion of smaller parts of the wild type HIV-1 gag gene (pSYN15 and
pSYN17) also led to a decrease in RNA levels (Figure 15B, lanes 1-3 and 8-9),
but not to levels as low as when the whole gag sequence was used (lanes 1-3, 4-7
and 8-9 in Figure 15B). This indicates that the effect of INS sequences is
dependent on their size. Insertion of the wild type HIV-1 gag in the reverse
orientation (pSYN11) had no effect on RNA levels (Figure 15A, lanes 1 and 5-6).
However, a splicing event seemed to take place in that case, as indicated by the
size of the RNA (equal to the size of the codon optimised gag-pol RNA) and by
the translation product (gag-pol, in equal amounts compared to pSYNGP, as
verified by Western blot analysis).

These data indicate therefore that wild type HIV-1 gag instability sequences act
in a position and size dependent manner, probably irrespective of translation. It
should also be noted that the RRE was unable to rescue the destabilised RNAs
through interaction with Rev.

Construction of an HIV-1 based vector system that lacks all the accessory
proteins



Until now several HIV-1 based vector systems have been reported that lack all
accessory proteins but Rev (49, 60). We wished to investigate whether the codon
optimised gene would permit the construction of an HIV-1 based vector system that
lacks all accessory proteins. We initially deleted rev/RRE and any residual env
sequences, but kept the first 625 nucleotides of gag, as they have been shown to
play a role in efficient packaging (61). Two vector genome constructs were made,
pH6.1nZ (retaining only HIV sequences up to nt 625 of gag) and pH6.2nZ (same as
pH6.1nZ, but also retaining the env splice acceptor). These were derived from a
conventional HIV vector genome that contains RRE and expresses Rev (pH6nZ).
Our 3-plasmid vector system now expressed only HIV-1 gag-pol and the VSV-G
envelope proteins. Vector particle titres were determined as described in the
previous section. A ratio of 2 : 2 : 1 of vector genome (pH6Z or pH6.1nZ or
pH6.2nZ) : gag-pol expression vector (pGP-RRE3 or pSYNGP) : pHCMV-G was
used. Transfections were performed in the presence or absence of pCMV-Rev, as
gag-pol expression was still Rev dependent for the wild type gene. The results are
summarised in Figure 22 and indicate that an HIV vector could be produced in the
total absence of Rev, but that maximum titres were compromised at 20-fold lower
than could be achieved in the presence of Rev. As gag-pol expression should be the
same for pSYNGP with pH6nZ or pH6.1nZ or pH6.2nZ (since it is Rev
independent), as well as for pGP-RRE3 when Rev is provided in trans, we
suspected that the vector genome retained a requirement for Rev and was therefore
limiting the titres. To confirm this, Northern blot analysis was performed on
cytoplasmic RNA prepared from cells transfected with pH6nZ or pH6.1nZ in the
presence or absence of pCMV-Rev. As can be seen in Figure 17, lanes 1-4, the
levels of cytoplasmic RNA derived from pH6nZ were 5-10 fold higher than those
obtained with pH6.1nZ (compare lanes 1-2 to lanes 3-4). These data support the
notion that RNA produced from the vector genome requires the Rev/RRE system to
ensure high cytoplasmic levels. This may be due to inefficient nuclear export of the
RNA, as INS sequences residing within gag were still present.

Further deletions in the gag sequences of the vector genome might therefore be
necessary to restore titres. To date efficient packaging has been reported to require
360 (62) or 255 (63) nucleotides of gag in vectors that still retain env sequences, or
about 40 nucleotides of gag in a particular combination of splice donor mutation,
gag and env deletions (64, 63). In an attempt to remove the requirement for
Rev/RRE in our vector genome without compromising efficient packaging we
constructed a series of vectors derived from pH6nZ containing progressively larger
deletions of HIV-1 sequences (only sequences upstream and within gag were
retained) plus and minus a mutant major splice donor (SD) (GT to CA mutation).
Vector particle titres were determined as before and the results are summarised in
Figure 23. As can be seen, deletion of up to nt 360 in gag (vector pHS3nZ) resulted
in an increase in titres (compared to pH6.1nZ or pH6.2nZ) and only a 5-fold
decrease (titres were 1.3-1.7 x 10^) compared to pH6nZ. Further deletions resulted
in titres lower than pHS3nZ and similar to pH6.1nZ. In addition, the SD mutation
did not have a positive effect on vector titres and in the case of pHS3nZ it resulted
in a 10-fold decrease in titres (compare titres for pHS3nZ and pHS7nZ in Figure
23). Northern blot analysis on cytoplasmic RNA (Figure 17, lanes 1 and 5-12)
showed that RNA levels were indeed higher for pH6nZ, which could account for
the maximum titres observed with this vector. RNA levels were equal for pHS1nZ
(lane 5), pHS2nZ (lane 6) and pHS3nZ (lane 7) whereas titres were 5-8 fold higher
for pHS3nZ. It is possible that further deletions (than that found in pHS3nZ) in gag
might result in less efficient packaging (as for HIV-1 the packaging signal extends
into gag) and therefore even though all 3 vectors produce similar amounts of RNA
only pHS3nZ retains maximum packaging efficiency. It is also interesting to note
that the SD mutation resulted in increased RNA levels in the cytoplasm (compare
lanes 6 and 10, 7 and 11 or 8 and 12 in Figure 17) but equal or decreased titres
(Figure 23). The GT dinucleotide that was mutated is in the stem of SL2 of the
packaging signal (65). It has been reported that SL2 might not be very important for
HIV-1 RNA encapsidation (65, 66), whereas SL3 is of great importance (67).
Folding of the wild type and SD-mutant vector sequences with the RNAdraw
software program revealed that the mutation significantly alters the secondary
structure of the RNA and not only of SL2. It is likely therefore that although the SD
mutation enhances cytoplasmic RNA levels it does not increase titres as it alters the
secondary structure of the packaging signal.

To investigate whether the titre differences that were observed with the Rev minus
vectors were indeed due to Rev dependence of the genomes, the RRE sequence (nt
7769 - 8021 of the HXB2 sequence) was inserted in the SpeI site (downstream of
the gag sequence and just upstream of the internal CMV promoter) of pH6.1nZ,
pHS1nZ, pHS3nZ and pHS7nZ, resulting in plasmids pH6.1nZR, pHS1nZR,
pHS3nZR and pHS7nZR respectively. Vector particle titres were determined with
pSYNGP and pHCMVG in the presence or absence of Rev (pCMV-Rev) as before
and the results are summarised in Figure 24. In the absence of Rev titres were
further compromised for pH6.1nZR (7-fold compared to pH6.1nZ), pHS3nZR (6-
fold compared to pHS3nZ) and pHS7nZR (2.5-fold compared to pHS7nZ). This
was expected, as the RRE also acts as an instability sequence (68) and so it would
be expected to confer Rev-dependence. In the presence of Rev titres were restored
to the maximum titres observed for pH6nZ in the case of pHS3nZR (5 x 10^) and
pH6.1nZR (2 x 10^). Titres were not restored for pHS7nZR in the presence of Rev.
This supports the hypothesis that the SD mutation in pHS7nZ affects the structure of
the packaging signal and thus the packaging ability of this vector genome, as in this
case Rev may be able to stimulate vector genome RNA levels, as for pHS3nZR and
pH6.1nZR, but it cannot affect the secondary structure of the packaging signal. For
vector pHS1nZ inclusion of the RRE did not lead to a decrease in titres. This could
be due to the fact that pHS1nZ contains only 40 nucleotides of gag sequences and
therefore even with the RRE the size of instability sequences is not higher than for
pHS2nZ, which gives equal titres to pHS1nZ. Rev was able to partially restore titres
for pHS1nZR (10-fold increase when compared to pHS1nZ and 8-fold lower than
pH6nZ) but not fully as in the case of pHS3nZ. This is also in agreement with the
hypothesis that 40 nucleotides of HIV-1 gag sequences might not be sufficient for
efficient vector RNA packaging and this could account for the partial and not
complete restoration in titres observed with pHS1nZR in the presence of Rev.

In addition, end-point titres were determined for pHS3nZ and pH6nZ with
pSYNGP in HeLa and HT1080 human cell lines. In both cases titres followed the
pattern observed in 293T cells, with titres being 2-3 fold lower for pHS3nZ than
for pH6nZ (See Figure 10). Finally, transduction efficiency of vector produced
with pHS3nZ or pH6nZ and different amounts of pSYNGP or pGP-RRE3 at
different m.o.i.'s (and as high as 1) was determined in HT1080 cells. This
experiment was performed as the high level gag-pol expression from pSYNGP
may result in interference by genome-empty particles at high vector
concentrations. As expected for VSVG pseudotyped retroviral particles (69),
transduction efficiencies correlated with the m.o.i.'s, whether high or low
amounts of pSYNGP were used and with pH6nZ or pHS3nZ. For m.o.i. 1
transduction efficiency was approximately 50-60% in all cases (Figure 18). The
above data indicate that no interference due to genome-empty particles is
observed in this experimental system.
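
As a rough check on the figure of approximately 50-60% at m.o.i. 1, the fraction of cells expected to receive at least one transducing particle can be estimated with a Poisson model. This model is an idealisation that the text does not itself invoke: it assumes that particles are distributed randomly and independently over equally susceptible cells. The function name is hypothetical.

```python
import math


def expected_transduced_fraction(moi: float) -> float:
    """Fraction of cells receiving at least one transducing particle.

    Assumes (as an idealisation) that the number of transducing particles
    per cell is Poisson distributed with mean equal to the m.o.i.
    """
    if moi < 0:
        raise ValueError("m.o.i. must be non-negative")
    # P(at least one hit) = 1 - P(zero hits) = 1 - exp(-moi)
    return 1.0 - math.exp(-moi)


if __name__ == "__main__":
    for moi in (0.1, 0.5, 1.0):
        print(moi, round(expected_transduced_fraction(moi), 3))
```

Under these assumptions an m.o.i. of 1 predicts that roughly 63% of cells are transduced, which is in the same range as the efficiency reported above and would be consistent with little or no interference from genome-empty particles.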

The codon optimised gag-pol gene does not use the exportin-1 nuclear export
pathway

Rev mediates the export of unspliced and singly spliced HIV-1 mRNAs via the
nuclear export receptor exportin-1 (CRM1) (70, 71, 72, 73, 74). Leptomycin B
(LMB) has been shown to inhibit leucine-rich NES mediated nuclear export by
disrupting the formation of the exportin-1/NES/RanGTP complex (75, 72). In
particular, LMB inhibits nucleo-cytoplasmic translocation of Rev and Rev-
dependent HIV mRNAs (76). To investigate whether exportin-1 mediates the
export of the codon optimised gag-pol constructs, the effect of LMB on protein
production was tested. Western blot analysis was performed on cell lysates from
cells transfected with the gag-pol constructs (+/- pCMV-Rev) and treated or not
with LMB (7.5 nM, for 20 hours, beginning treatment 5 hours post-transfection).
To confirm that LMB had no global effects on transport, the expression of β-gal
from the control plasmid pCMV-βGal was also measured. An actin internal
control was used to account for protein variations between samples. The results
are shown in Figure 16. As expected (76), the wild type gag-pol was not
expressed in the presence of LMB (compare lanes 3 and 4), whereas LMB had no
effect on protein production from the codon optimised gag-pol, irrespective of
the presence of the RRE in the transcript and the provision of Rev in trans
(compare lanes 5 and 6, 7 and 8, 9 and 10, 11 and 12, 5-6 and 11-12). The
resistance of the expression of the codon-optimised gag-pol to inhibition by LMB
indicates that the exportin-1 pathway is not used and therefore an alternative
export pathway must be used. This offers a possible explanation for the Rev
independent expression. The fact that the presence of a nonfunctional Rev/RRE
interaction did not affect expression implies that the RRE does not necessarily act
as an inhibitory (e.g. nuclear retention) signal per se, which is in agreement with
previous observations (5, 58).

In conclusion, this is the first report of an HIV-1 based vector system, composed
of pSYNGP, pHS3nZ and pHCMVG, where significant vector production can be
achieved in the absence of all accessory proteins. These data indicate that in order
to achieve maximum titres the HIV vector genome must be configured to retain
efficient packaging and that this requires the retention of gag sequences and a
splice donor. By reducing the gag sequence to 360 nt in pHS3nZ and combining
this with pSYNGP it is possible to achieve a titre of at least 10^ LU./ml that is only
5-fold lower than the maximum levels achieved in the presence of Rev.

Example 2 - EIAV

Codon-optimised EIAV gag-pol expression cassettes

We also examined if the codon-optimisation process would alter the properties of
the gag-pol gene of the non-primate lentivirus EIAV. The sequence of the codon-
optimised gene is shown from ntllOB to 5760 of SEQ ID NO:5 (Figure 9). The
wild type and the codon-optimised sequences are denoted WT and CO, respectively.
The codon usage was changed to that of highly expressed mammalian genes.
pESYNGP (Figure 27 and SEQ ID NO:5) was made by transferring an XbaI-NotI
fragment from a plasmid containing a codon-optimised EIAV gag/pol gene,
synthesised by Operon Technologies Inc., Alameda, CA, into pCIneo (Promega).
The gene was supplied in a proprietary plasmid backbone, GeneOp. The
fragment transferred to pCIneo includes sequences flanking the codon-optimised
EIAV gag/pol ORF: tctagaGAATTCGCCACCATG- EIAV gag/pol-
TGAACCCGGGgcggccgc. The ATG start and TGA stop codons are shown in
bold and the recognition sequences for XbaI and NotI sites in lower case.

The expression of Gag/Pol from the codon-optimised gene was assessed with
respect to that from various wild type EIAV gag/pol expression constructs by
transient transfection of HEK 293T cells (Figure 25). Transfections were carried
out using the calcium phosphate technique, using equal moles of each Gag/Pol
expression plasmid together with a plasmid which expressed EIAV Rev either
from the wild type sequence or from a codon-optimised version of the gene:
pCIneoEREV (WO 99/32646) (Figure 35 and SEQ ID NO:13) or pESYNREV
(Figure 36 and SEQ ID NO:14), respectively. pESYNREV is a pCIneo-based
plasmid (Promega) which was made by introducing the EcoRI to SalI fragment
from a synthetic EIAV REV plasmid, made by Operon Technologies, Alameda,
CA. The plasmid backbone was the proprietary plasmid GeneOp, into which was
inserted a codon-optimised EIAV REV gene flanked by EcoRI and SalI
recognition sequences and a Kozak consensus sequence to drive efficient
translation of the gene. The mass of DNA in each transfection was equalised by
addition of pCIneo plasmid. In transfections in which a Rev expression plasmid
was omitted, a similar mass of pCIneo (Promega) was used instead (lanes
labelled pCIneo). Cytoplasmic extracts were prepared 48 hours post transfection
and 15 µg amounts of protein were fractionated by SDS-PAGE and then
transferred to Hybond ECL. The Western blot was probed with a polyclonal
antiserum from an EIAV-infected horse and then with a secondary antibody, anti-
horse horseradish peroxidase conjugate. Development of the blot was carried
out using the ECL kit (Amersham). Positive controls for the blotting and
development procedure, and cytoplasmic extract from untransfected HEK 293T
cells are as indicated. The positions of various EIAV proteins are indicated.

Expression from wild type gag/pol was achieved from various plasmids (see
Figure 25). pONY3.2T is a derivative of pONY3.1 (WO 99/32646) (Figure 34
and SEQ ID NO: 12) in which mutations which ablate expression of Tat and S2
have been made. In addition, the EIAV sequence is truncated downstream of the
second exon of rev. Specifically, expression of Tat is ablated by an 83 nt deletion
in exon 2 of tat which corresponds, with respect to the wild type EIAV sequence,
Acc. No. U01866, to deletion of nt 5234-5316 inclusive. S2 ORF expression is
ablated by a 51 nt deletion, corresponding to nt 5346-5396 of Acc. No. U01866.
The EIAV sequence is deleted downstream of a position corresponding to nt
7815 of Acc. No. U01866. These alterations do not alter rev, hence this gene
is expressed as for pONY3.1. pONY3.2 OPTI is a derivative of
pONY3.1 which has the same deletions for ablation of Tat and S2 expression as
described above. In addition, the first 372 nt of gag have been 'codon-optimised'
for expression in human cells. The wild type and codon-optimised sequences
present in pONY3.2 OPTI in this region are compared in
Figure 43. Base differences between the sequences are indicated. The region
which was codon-optimised represents the region of overlap between the vector
and wild-type gag/pol expression constructs. Reduction of homology within this
region would be expected to improve the safety profile of the vector system due
to the reduced chances of recombination between the vector genome and the
gag/pol transcripts. 3.2 OPTI-Ihyg is a derivative of 3.2 OPTI in which the
SnaBI-NotI fragment of 3.2 OPTI is transferred to pIRES1hygro (Clontech)
prepared for ligation by digestion with the same enzymes. The gag/pol gene is thus
placed upstream of the IRES hygromycin phosphotransferase. Of note is the fact
that the resulting construct contains the intron from pCIneo, not from
pIRES1hygro. pEV53B is a derivative of pEV53A (WO 98/51810) in which the
EIAV-derived sequence upstream of the Gag initiation codon is reduced to
include only the major splice donor and surrounding sequences:
CAG/GTAAGATG, where the Gag initiation codon is shown in bold face.

The results (Figure 26) show the Rev-dependence of Gag/Pol expression from
pHORSE3.1 (WO 99/32646), which has an EIAV-derived leader sequence
starting just downstream of the primer binding site and an RRE placed
downstream of gag/pol composed of the two EIAV sequences reported to have
RRE activity. Expression was enhanced by the same amount when Rev
expression was driven by wild type (pCIneoERev) (Figure 35) or
codon-optimised (pESYNREV) (Figure 36) genes. This result confirms the
functionality of the codon-optimised Rev expression plasmid.

In contrast to expression of Gag/Pol from pONY3.1, expression from pESYNGP
was not influenced by the presence of Rev; however, it was slightly lower than
from pONY3.1 or pONY3.2T. Expression from pESYNGPRRE (Figure 30 and
SEQ ID NO:7), in which the EIAV RRE sequence present in pHORSE3.1 is
placed downstream of gag/pol, appeared slightly lower than from pESYNGP.
The levels of expression from 3.2 OPTI and 3.2 OPTI-Ihyg were significantly
lower than from pESYNGP or pONY3.1, even in the presence of Rev. This
result suggested that there may be determinants of Gag/Pol expression within the
first 372nt of gag and showed that 3.2 OPTI was unlikely to be useful as a
basis for EIAV vector production. Furthermore, it demonstrates that
codon-optimisation of only certain regions of the whole gag/pol gene may not
lead to high levels of Rev-independent expression.



We have previously demonstrated (43) that the 5' leader (121bp upstream of the
ATG start codon) and the RRE sequence (43) are important for high expression of
the wild type EIAV gag-pol. Three constructs were made that contained either the
leader sequence (LpESYNGP), the leader and RRE sequences (LpESYNGPRRE)
or the RRE sequence (pESYNGPRRE). The sequences of these constructs are
shown in SEQ ID NOS:6-8 and Figures 28-30. They were transfected into 293T
cells in either the presence or absence of a Rev expression plasmid. The cell
supernatant was then assayed for reverse transcriptase (RT) activity, using a
conventional RT assay, to evaluate which construct generated the highest amount of
gag-pol mRNA. The results are shown in Figures 39 and 40. It is clear from these
results that the 5' leader leads to an increase in RT activity. The ability of these
Gag/Pol expression constructs to support formation of infectious vector particles
was also tested by transient transfection of HEK 293 cells. The results of this
analysis show that all of the constructs could provide functional EIAV Gag/Pol,
and show the Rev dependence of titre with the pONY8.0Z vector genome plasmid,
which does not encode any EIAV proteins (Figure 41).

The ability of pESYNGP to act in concert with a minimal EIAV vector genome
plasmid pONY8.1Z (Figure 33, SEQ ID NO:11) was evaluated (Figure 42). The
result shows that the titres obtained with pESYNGP and pONY8.1Z are about
10-fold lower than from pONY3.1 and pONY8.1Z. This reduced titre reflects the
lack of Rev protein in the system rather than a deficiency of Gag/Pol production,
which we have already shown is independent of Rev expression.

Expression of EIAV Gag/Pol was also tested from pESDSYNGP (Figure 50 and
SEQ ID NO: 18) in which the Kozak consensus sequence of Gag is replaced by
the natural EIAV splice donor. pESDSYNGP was made from pESYNGP by
exchange of the 306bp EcoRI-NheI fragment, which runs from just upstream of
the start codon for gag/pol to approximately 300 base pairs inside the gag/pol
ORF, with a 308bp EcoRI-NheI fragment derived by digestion of a PCR product
made using pESYNGP as template and using the following primers: SD FOR
[GGCTAGAGAATTCCAGGTAAGATGGGCGATCCCCTCACCTGG] and SD
REV [TTGGGTACTCCTCGCTAGGTTC]. This manipulation replaces the
Kozak consensus sequence upstream of the ATG in pESYNGP with the splice
donor found in EIAV. The sequence between the EcoRI site and the ATG of
gag/pol is thus CAGGTAAG, exactly as found in the natural viral sequence.
Therefore the mRNA is deleted with respect to sequences upstream but not
downstream of the splice donor. The performance of pESDSYNGP was assessed
relative to pESYNGP and other expression plasmids by measurement of reverse
transcriptase activity in supernatants from transiently transfected HEK 293T cells
using a Taqman-based version of the product enhanced reverse transcriptase
(PERT) assay. In this method, reverse transcriptase associated with vector
particles is released by mild detergent treatment and used to synthesize cDNA
using MS2 bacteriophage RNA as template. MS2 RNA template and primer are
present in excess, hence the amount of cDNA is proportional to the amount of RT
released from the particles. Therefore, the amount of cDNA synthesised is
proportional to the number of particles. MS2 cDNA is then quantitated using
Taqman technology. The assay is carried out on test samples in parallel with a
vector stock of known titre and estimated particle content. The use of the
standard allows creation of a 'standard curve' and allows the relative RT content
of various samples to be calculated. The results of this analysis are shown in
Figure 49. The results show that Gag/Pol expression is virtually identical from
pESYNGP and pESDSYNGP. The results also indicate that expression is not
significantly enhanced by Rev. The activity of the Rev expression plasmid is
confirmed by the result obtained with pHORSE +, in which there is an RRE
downstream of the wild type EIAV gag/pol, and that shows a 6-fold enhancement
of expression in the presence of Rev. We also noted that the expression from
pHORSE was enhanced 3-fold in the presence of Rev. Since this construct has
no RRE it suggests that Rev may be having a non-specific enhancing effect on
expression, possibly as a result of being expressed at high levels in this
experimental system.
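
The design of the SD FOR primer given above can be checked mechanically. The following minimal Python sketch locates the EcoRI recognition sequence (GAATTC) in the primer and returns the bases lying between that site and the first downstream ATG. The function name between_site_and_start is hypothetical and is used for illustration only.

```python
# Primer sequence as quoted in the specification (SD FOR).
SD_FOR = "GGCTAGAGAATTCCAGGTAAGATGGGCGATCCCCTCACCTGG"
ECORI_SITE = "GAATTC"  # EcoRI recognition sequence


def between_site_and_start(primer: str, site: str = ECORI_SITE) -> str:
    """Return the bases between the end of `site` and the first ATG after it."""
    site_end = primer.index(site) + len(site)
    atg_pos = primer.index("ATG", site_end)
    return primer[site_end:atg_pos]


spacer = between_site_and_start(SD_FOR)
print(spacer)
```

The returned spacer can be compared directly with the splice donor sequence quoted in the text.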

The ability of pESYNGP to participate in the formation of infectious viral vector
particles, when co-transfected with plasmids for the vector genome and envelope,
was assessed by transient transfection of HEK 293T cells, as described previously
(49, 50). Briefly, 293T cells were seeded on 6cm dishes (1.2 x 10^/dish) and 24 hours
later they were transfected by the calcium phosphate procedure. The medium
was replaced 12 hours post-transfection and supernatants were harvested 48
hours post-transfection, filtered (0.45 µm filters) and titered by transduction of
D17 canine osteosarcoma cells in the presence of 8 µg/ml Polybrene (Sigma).
Cells were seeded at 0.9 x 10^/well in 12 well plates 24 hours prior to use in
titration assays. Dilutions of supernatant were made in complete medium
(DMEM/10% FBS) and 0.5 ml aliquots plated out onto the D17 cells. 4 hours
after addition of the vector the medium was supplemented with a further 1 ml of
medium. Transduction was assessed by X-gal staining of cells 48 hours after
addition of viral dilutions.
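
A titre in transducing units per ml can be estimated from such a titration by counting X-gal-positive colonies at a countable dilution. The sketch below assumes the common convention titre = colonies x dilution factor / inoculum volume. That formula, the function name titre_tu_per_ml and the example count are illustrative assumptions and are not values taken from the experiments described; only the 0.5 ml inoculum volume comes from the text.

```python
def titre_tu_per_ml(colonies: int, dilution_factor: float,
                    inoculum_ml: float = 0.5) -> float:
    """Estimate titre (transducing units per ml) from one countable well.

    colonies        -- number of X-gal-positive colonies counted (hypothetical)
    dilution_factor -- e.g. 1e4 for a 1:10,000 dilution of the supernatant
    inoculum_ml     -- volume of diluted supernatant plated (0.5 ml in the text)
    """
    if inoculum_ml <= 0:
        raise ValueError("inoculum volume must be positive")
    return colonies * dilution_factor / inoculum_ml


# Hypothetical example: 45 colonies at a 1:10,000 dilution.
example = titre_tu_per_ml(colonies=45, dilution_factor=1e4)
print(f"{example:.1e} transducing units per ml")
```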

The vector genomes used for these experiments were pONY4.0Z (Figure 31 and
SEQ ID NO:9) and pONY8.0Z (Figure 32 and SEQ ID NO: 10).

pONY4.0Z (WO 99/32646) was derived from pONY2.11Z by replacement of the
U3 region in the 5' LTR with the cytomegalovirus immediate early promoter
(pCMV). This was carried out in such a way that the first base of the transcript
derived from this CMV promoter corresponds to the first base of the R region.
This manipulation results in the production of high levels of vector genome in
transduced cells, particularly HEK 293T cells, and has been described previously
(50). pONY4.0Z expresses all EIAV proteins except for envelope, expression of
which is ablated by a deletion of 736nt between the HindIII sites present in env.

pONY8.0Z was derived from pONY4.0Z by introducing mutations which 1)
prevented expression of TAT by an 83nt deletion in exon 2 of tat, 2) prevented
S2 ORF expression by a 51nt deletion, 3) prevented REV expression by deletion
of a single base within exon 1 of rev and 4) prevented expression of the
N-terminal portion of gag by insertion of T in ATG start codons, thereby changing
the sequence to ATTG from ATG. With respect to the wild type EIAV sequence
Acc. No. U01866 these correspond to deletion of nt 5234-5316 inclusive, nt
5346-5396 inclusive and nt 5538. The insertion of T residues was after nt 526
and 543.
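
Coordinate-based edits of this kind (deletion of inclusive, 1-based ranges and insertion of a base after a given position) can be applied to a sequence string as sketched below in Python. The toy sequence and the helper names inclusive_length and apply_edits are hypothetical; applying the real coordinates would require the U01866 sequence, which is not reproduced here. Edits are applied from the highest coordinate downwards so that the coordinates of the remaining edits stay valid, and the edits are assumed not to overlap.

```python
def inclusive_length(start: int, end: int) -> int:
    """Length of a 1-based inclusive range, e.g. nt 5234-5316."""
    return end - start + 1


def apply_edits(seq: str, deletions=(), insertions=()) -> str:
    """Apply non-overlapping edits given in 1-based coordinates.

    deletions  -- iterable of (start, end), inclusive
    insertions -- iterable of (pos, bases), inserted after position pos
    """
    edits = [(start, "del", end) for start, end in deletions]
    edits += [(pos, "ins", bases) for pos, bases in insertions]
    # Work from the 3' end so earlier coordinates are not shifted.
    for pos, kind, arg in sorted(edits, key=lambda e: e[0], reverse=True):
        if kind == "del":
            seq = seq[:pos - 1] + seq[arg:]
        else:
            seq = seq[:pos] + arg + seq[pos:]
    return seq


# The deletion sizes quoted in the text follow from the coordinates.
print(inclusive_length(5234, 5316), inclusive_length(5346, 5396))

# Toy example (hypothetical sequence): ATG becomes ATTG, and nt 9-11 are deleted.
TOY = "CCATGAAACCCGGGTTT"
edited = apply_edits(TOY, deletions=[(9, 11)], insertions=[(4, "T")])
print(edited)
```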

The results of this analysis are shown tabulated in Figure 37, and graphically in
Figure 38. Transfections were carried out with only 3 plasmids (vector genome,
gag/pol expression plasmid and VSV-G expression plasmid) - diagonal lined
bars, or with four plasmids, which included the previous set of plasmids together
with an additional plasmid encoding Rev or a similar plasmid not encoding a
functional protein - filled bars. The results show that high titres of vector can be
achieved using pESYNGP to supply EIAV Gag/Pol. The highest titres were
obtained using the Rev-expressing vector genome plasmid, pONY4.0Z, and they
were only slightly lower than observed when Gag/Pol was supplied by pONY3.1.
Lower titres were observed with the pONY8.0Z vector genome plasmid with
pESYNGP than with pONY3.1. This is due to the Rev expression requirement of
pONY8.0Z. Rev is expressed by pONY3.1, but not pESYNGP. These results
confirm the utility of the codon-optimised Gag/Pol expression plasmid.

Use of the synthetic EIAV gag/pol gene in construction of cell lines which
stably express EIAV gag/pol

Cell lines which express high amounts of EIAV Gag/Pol are required for the
construction of packaging and producer cells for EIAV vectors. As a first step in
their construction HEK 293 cells were stably transfected with pIRES1hyg
ESYNGP (Figure 44 and SEQ ID NO: 17), in which EIAV Gag/Pol expression is
driven by a CMV promoter, and is linked to an ORF for expression of
hygromycin phosphotransferase by an EMCV IRES. pIRES1hyg ESYNGP was
made as follows. The synthetic EIAV gag/pol gene and flanking sequences were
transferred from pESYNGP into the pIRES1hygro expression vector (Clontech).
First, pESYNGP was digested with EcoRI, the ends filled in by treatment
with T4 DNA polymerase, and then digested with NotI. pIRES1hygro was
prepared for ligation with this fragment by digestion with NsiI, the ends trimmed
flush by treatment with T4 DNA polymerase, then digested with NotI. Prior to
transfection into HEK 293 cells pIRES1hyg ESYNGP was digested with AhdI
which linearises the plasmid.

Clonal cell lines were derived by serial dilution and analysed for expression of
Gag/Pol by a Taqman-based product enhanced reverse transcriptase (PERT)
assay. Data for the cell line Q3.29, which expressed the highest level of Gag/Pol,
are shown. The analysis showed that the level of expression from the
codon-optimised EIAV Gag/Pol cassette in Q3.29 was very similar to that seen for
an EIAV producer line, 8Z.20, in which Gag/Pol is expressed from the pEV53B
wild type expression cassette, that produced vector particles at titres of almost 10^
transducing units per ml (Figure 45). Assuming exponential amplification
during the assay, a difference in Ct value of 1.0 corresponds to a 2-fold
difference in concentration of the reverse transcriptase released from the particles.
Therefore the difference in Gag/Pol expression between Q3.29 and 8Z.20 cells is
approximately 2-8 fold. Furthermore the Ct values observed indicate that the
level of expression of Gag/Pol is significantly higher than in samples of
pONY8G vector particles with a titre of 2 x 10^ transducing units per ml on D17
cells, but made by transient transfection of HEK 293T cells. These data indicate
that the codon-optimised EIAV Gag/Pol construct can be used in the construction
of EIAV packaging and producer lines and confirm the previous result that
expression is independent of Rev expression.
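
The relationship between a Ct difference and a fold difference described above can be written as fold = 2^(delta Ct). The short Python sketch below applies it. The function name fold_difference and the example delta Ct values are illustrative assumptions, and perfect doubling per cycle is assumed, as in the text.

```python
def fold_difference(delta_ct: float, efficiency: float = 2.0) -> float:
    """Fold difference in template implied by a difference in Ct.

    efficiency is the amplification factor per cycle; 2.0 corresponds to
    the ideal exponential amplification assumed in the text.
    """
    return efficiency ** delta_ct


# A Ct difference of 1 to 3 cycles spans a 2- to 8-fold difference.
for delta_ct in (1.0, 2.0, 3.0):
    print(delta_ct, fold_difference(delta_ct))
```

Under this assumption, the 2-8 fold range quoted in the text corresponds to Ct differences of between 1 and 3 cycles.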

The Q3.29 cell line was then tested for its ability to support production of
infectious vector particles when transfected with a vector genome plasmid,
pONY8.0Z, the VSV-G envelope expression plasmid, pRV67, and the EIAV
REV expression plasmid, pESYNREV. In addition we also evaluated the
performance of a plasmid, pONY8.3G FB29 (-), which is a modified form of the
pONY8G vector genome plasmid. pONY8G is a standard EIAV vector genome
used for comparison purposes. The modifications and construction of
pONY8.3G FB29 (-) (SEQ ID NO:19) are described in PCT/GB00/03837 and
briefly are 1) the introduction of loxP recognition sites upstream and downstream
of the vector genome cassette and 2) the placement of an expression cassette for
codon-optimised REV, derived from pESYNREV and driven by the FB29 U3
promoter, downstream of the vector genome cassette and orientated so that the
direction of transcription was towards the vector genome cassette. The REV
expression cassette is located upstream of the 3' loxP site. Thus the pONY8.3G
FB29 (-) plasmid carries expression cassettes for the vector genome RNA and for
EIAV Rev.

The titres were established by limiting dilution on D17 canine osteosarcoma cells
and are shown in Figure 46.

The titres obtained from transfections 2-6 were up to 4.5 x 10^ transducing units
per ml, indicating levels of Gag/Pol expression sufficient to support titres at least
this high. The titres obtained were not higher when additional Gag/Pol was
supplied (transfection 1), indicating that Gag/Pol expression was not the limitation
on titre.

Improved safety profile due to Gag/Pol expression from a codon-optimised 
expression construct 


RCR formation takes place by recombination between different components of
the vector system or by recombination of vector system components with
nucleotide sequences present in the producer cells. Although recombination at
the DNA level during construction of producer cell lines is possible (perhaps
leading to insertional activation of endogenous retroelements or retroviruses) it is
thought that recombination to produce RCR occurs mainly between RNAs
undergoing reverse transcription, and hence occurs within the mature vector
particles. In consequence, recombination will be more likely to occur between
RNAs which contain packaging signals, such as the vector genome and the gag/pol
mRNA. Usually, however, the gag/pol transcript is modified so that it is deleted
with respect to some or all defined packaging elements, thereby reducing the
chances of its involvement in recombination.

The codon-optimisation process used to create the HIV and EIAV Gag/Pol
expression plasmids, pSYNGP and pESYNGP, also results in disruption of
sequences and structures that direct packaging as a result of introducing changes
at approximately every 3rd nucleotide position. We have obtained evidence for
the lower level of incorporation of the codon-optimised RNA derived from
pESYNGP into virions.
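
Because synonymous codons mostly differ at their third base, codon optimisation tends to introduce changes at about every third nucleotide while leaving the encoded protein unchanged. The Python sketch below illustrates this on the first 21 nt of the codon-optimised EIAV gag ORF, taken from the SD FOR primer sequence given earlier in this specification. The comparison sequence ALT is a hypothetical synonymous recoding chosen for illustration and is not asserted to be the wild type sequence; the helper names are also hypothetical. The translation table is the standard genetic code.

```python
from collections import Counter

# Standard genetic code, built in the conventional TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {a + b + c: AMINO[16 * i + 4 * j + k]
        for i, a in enumerate(BASES)
        for j, b in enumerate(BASES)
        for k, c in enumerate(BASES)}

OPT = "ATGGGCGATCCCCTCACCTGG"  # from primer SD FOR, downstream of CAGGTAAG
ALT = "ATGGGAGACCCTTTGACATGG"  # hypothetical synonymous recoding (assumption)


def translate(seq: str) -> str:
    """Translate complete in-frame codons of seq into one-letter amino acids."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODE[seq[i:i + 3]] for i in range(0, usable, 3))


def diffs_by_codon_position(a: str, b: str) -> Counter:
    """Count mismatches between two equal-length ORFs by codon position (1-3)."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return Counter(i % 3 + 1 for i, (x, y) in enumerate(zip(a, b)) if x != y)


print(translate(OPT), translate(ALT))
print(dict(diffs_by_codon_position(OPT, ALT)))
```

In this toy comparison the two sequences encode the same peptide and most of the mismatches fall at the third codon position, which is the pattern the paragraph above describes.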

The packaging of mRNAs derived from a wild type gag/pol expression
cassette, pEV53B, and from the codon-optimised EIAV gag/pol expression cassette,
pESYNGP, was compared. Medium was collected from HEK 293-based cell
lines which were stably transfected with either pEV53B (cell line B-241) or
pESYNGP. Both cell lines produce vector particles which do not contain vector
RNA and do not have envelopes. In some experiments, an EIAV vector genome
plasmid (pECG3-CZW) was transfected into the cells to serve as an internal
positive control for hybridisation and for the presence of particles capable of
packaging RNA. pECG3-CZW is a derivative of pEC-LacZ (WO 98/51810) and
was made from the latter by 1) reduction of gag sequences so that only the first
200nt of gag, rather than the first 577nt, was included and 2) inclusion of the
woodchuck hepatitis virus post-transcriptional regulatory element (WHV PRE)
(corresponding to nt 901-1800 of Acc. No. J04514) into the NotI site downstream
of the LacZ reporter gene.

Viral particles derived from each of the cell lines were then partially purified
from the medium by equilibrium density gradient centrifugation. To do this, 10
ml of medium from producer cells, harvested at 24 hours after induction with
sodium butyrate, was layered onto a 20-60% (w/w) sucrose gradient in TNE
buffer (pH 7.4) and centrifuged for 24 hours at 25,000 rpm and 4°C in a SW28
rotor. Fractions were collected from the bottom and 10 µl of each fraction
assayed for reverse transcriptase activity to locate viral particles. The results of
this analysis are shown in Figure 47, where the profile of reverse transcriptase
activity is shown as a function of gradient fraction. In these figures, the top of
the gradient is on the right. It should be noted that the levels of RT activity from
the pESYNGP-expressing cells were significantly lower than from
pEV53B-expressing cells. To determine the RNA content of the purified virions,
aliquots from the top, middle or bottom fractions were pooled (as indicated by the
bars labelled T, M and B) and the RNA from each pool was subjected to slot-blot
hybridization analysis. Using a probe specific for a common region of wild type
and synthetic gag/pol, encapsidation of RNA was easily detectable in the peak
fractions (M) of virions synthesized from the wild type construct (pEV53B), but
was not detected from virions synthesized from the synthetic Gag/Pol construct
(pESYNGP) (Figure 48). The control for the presence of capsid capable of
carrying out encapsidation was the EIAV G3-CZW vector genome, which was
readily detected in peak fractions from cells expressing either the wild type or
synthetic gag/pol proteins. Even taking into account the different levels of
expression from the wild type and synthetic Gag/Pol expression constructs, this
result indicates that the RNA from the codon-optimised gag/pol gene is packaged
significantly less efficiently than that from the wild type gene and represents a
significant improvement to the safety profile of the system. Of further note is
that the RNA transcribed from pEV53B was packaged. This RNA is deleted with
respect to sequences upstream of the splice donor sequence (CAG/GTAAG) and
yet was still packaged. This points to the localisation of major packaging
determinants within the gag coding region and is in contrast to the collected
observations on the location of the packaging signal of HIV-1.

In additional experiments we have shown that the packaging of transcripts from
pEV53B is only slightly lower than from pEV53A (Figure 51). This indicates
further that major packaging sequences are located within the gag coding region.

In these experiments cell line B-241 expressed pEV53B RNA and PEV-17
expressed pEV53A RNA. The EIAV vector genome used to confirm the
presence of packaging competent vector particles was G3-CZR, which is the
same as G3-CZW, described above, except for the replacement of the woodchuck
post-transcriptional regulatory element with a sequence containing the EIAV
RRE elements. Methodology was as described above.

All publications mentioned in the above specification are herein incorporated by
reference. Various modifications and variations of the described methods and
system of the invention will be apparent to those skilled in the art without departing
from the scope and spirit of the invention. Although the invention has been
described in connection with specific preferred embodiments, it should be
understood that the invention as claimed should not be unduly limited to such
specific embodiments. Indeed, various modifications of the described modes for
carrying out the invention which are obvious to those skilled in molecular biology
or related fields are intended to be within the scope of the following claims.

CLAIMS

1. Use of a nucleotide sequence coding for retroviral gag and pol proteins
capable of assembly of a retroviral vector genome into a retroviral particle in a
producer cell to generate a replication defective retrovirus in a target cell,
wherein the nucleotide sequence is codon optimised for expression in the
producer cell.

2. Use of a nucleotide sequence coding for retroviral gag and pol proteins
capable of assembly of a retroviral vector genome into a retroviral particle in a
producer cell to prevent packaging of the retroviral vector genome in a target cell,
wherein the nucleotide sequence is codon optimised for expression in the
producer cell.

3. Use of a nucleotide sequence coding for retroviral gag and pol proteins
capable of assembly of a retroviral vector genome comprising at least part of a
gag nucleotide sequence into a retroviral particle in a producer cell to prevent
recombination between said nucleotide sequence coding for retroviral gag and
pol proteins and the at least part of a gag nucleotide sequence, wherein the
nucleotide sequence coding for retroviral gag and pol proteins is codon optimised
for expression in the producer cell.

4. A use according to any preceding claim wherein the retroviral genome 
further comprises a nucleotide of interest (NOI). 

5. A use according to any preceding claim wherein the retroviral particle is a 
lentiviral particle. 

6. A use according to claim 5, wherein the retroviral particle is substantially
derived from HIV-1.

7. A use according to claim 6, wherein the codon optimised nucleotide
sequence has the sequence shown in SEQ ID NO: 15.

8. A use according to claim 5, wherein the retroviral particle is substantially
derived from EIAV.

9. A use according to claim 8, wherein the codon optimised nucleotide 
sequence has the sequence shown in SEQ ID NO: 16. 

10. A method of producing a replication defective retrovirus comprising 
transfecting a producer cell with the following: 

i) a retroviral genome; 

ii) a nucleotide sequence coding for retroviral gag and pol proteins; 
and 

iii) nucleotide sequences encoding other essential viral packaging 
components not encoded by the nucleotide sequence of (ii); 

characterised in that the nucleotide sequence coding for retroviral gag and pol 
proteins is codon optimised for expression in the producer cell. 

11. A method of preventing packaging of a retroviral genome in a target cell
comprising the steps of:

a. transfecting a producer cell with the following to produce
retroviral particles:

i) a retroviral genome;

ii) a nucleotide sequence coding for retroviral gag and pol proteins;
and

iii) nucleotide sequences encoding other essential viral packaging
components not encoded by one or more of the nucleotide
sequences of (ii); and

b. transfecting a target cell with retroviral particles of step (a);

characterised in that the nucleotide sequence coding for retroviral gag and pol
proteins is codon optimised for expression in the producer cell.

12. A method to prevent recombination between a retroviral vector genome 
and a nucleotide sequence encoding a viral polypeptide required for the assembly 
of the viral genome into retroviral particles comprising transfecting a producer 
cell with the following: 

(i) a retroviral genome comprising at least part of a gag nucleotide 
sequence; 

(ii) a nucleotide sequence coding for retroviral gag and pol proteins; 

and 

(iii) nucleotide sequences encoding other essential viral packaging 
components not encoded by the nucleotide sequence of (ii); 

characterised in that the nucleotide sequence coding for retroviral gag and pol
proteins is codon optimised for expression in the producer cell.

13. A method according to any one of claims 10 to 12 wherein the retroviral
genome further comprises a nucleotide of interest (NOI).

14. A method according to any one of claims 10 to 13 wherein the retroviral 
particle is a lentiviral particle. 

15. A method according to claim 14, wherein the retroviral particle is
substantially derived from HIV-1.

16. A method according to claim 15, wherein the codon optimised nucleotide 
sequence has the sequence shown in SEQ ID NO: 15. 

17. A method according to claim 14, wherein the retroviral particle is
substantially derived from EIAV.

18. A method according to claim 17, wherein the codon optimised nucleotide
sequence has the sequence shown in SEQ ID NO: 16.

19. A method according to any one of claims 10 to 18, wherein iii) comprises 
a nucleotide sequence coding for an env protein. 

20. A nucleotide sequence coding for retroviral gag and pol proteins having 
the sequence of SEQ. ID. No. 15 or 16. 

21. A method according to any one of claims 10 to 20 wherein at least one of 
i) to iii) contains one or more functional accessory genes. 

22. A method according to any one of claims 10 to 20 wherein i) to iii) are 
devoid of any functional accessory genes. 

23. A viral vector system comprising:

i) a nucleotide sequence of interest; and

ii) a nucleotide sequence encoding a viral polypeptide required for
the assembly of viral particles wherein the nucleotide sequence is as defined in
claim 20.

24. A viral production system comprising: 

i) a viral genome comprising at least one nucleotide sequence of 
interest; and 

ii) a nucleotide sequence encoding a viral polypeptide required for 
the assembly of the viral genome into viral particles wherein the nucleotide 
sequence is as defined in claim 20.

25. A system according to claim 23 or claim 24 wherein the viral vector is a
retroviral vector.

26. A system according to claim 25 wherein the retroviral vector is a 
lentiviral vector. 

27. A system according to any one of claims 23 to 26 wherein the lentiviral
vector is substantially derived from HIV-1 or EIAV.

28. A system according to any of claims 23 to 27 wherein the nucleotide
sequence defined in ii) also includes an envelope protein.

29. A system according to claim 28 wherein the envelope gene is codon
optimised.

30. A system according to any of claims 23 to 29 wherein the nucleotide of
interest is selected from a therapeutic gene, a marker gene and a selection gene.

31. A system according to any one of claims 23 to 30 comprising one or
more functional accessory genes.

32. A system according to any of claims 23 to 30 devoid of any functional
accessory genes.

33. A viral system according to any one of claims 23 to 32 for use in a 
method of producing viral particles. 

34. A method for producing a viral particle which method comprises 
introducing into a producer cell: 

i) a viral genome as defined in any one of claims 24 to 33,

ii) one or more nucleotide sequences as defined in claim 20 and,

iii) nucleotide sequences encoding other essential viral packaging
components not encoded by one or more of the nucleotide sequences of
(ii).

35. A viral particle produced by the production system of any one of claims 
24 to 33 or by the method of claim 34. 

36. A viral system according to any one of claims 23 to 33, or a viral particle 
according to claim 35, for treating a viral infection. 

37. A pharmaceutical composition comprising the viral system of any one of
claims 23 to 33, or the viral particle of claim 35, together with a pharmaceutically
acceptable carrier or diluent.

FIGURE 2

FIGURE 3a

gagpol-SYNgp [1 to 4308] -> Codon Usage

DNA sequence 4308 b.p. ATGGGCGCCCGC ... GATGAGGATTAG linear
1436 codons

MW : 161929 Dalton CAI(S.c.) : 0.080 CAI(E.c.) : 0.296

TTT phe F    5    TCT ser S    5    TAT tyr Y   10    TGT cys C    6
TTC phe F   30    TCC ser S   11    TAC tyr Y   29    TGC cys C   14
TTA leu L    2    TCA ser S    4    TAA OCH Z         TGA OPA Z
TTG leu L    7    TCG ser S    6    TAG AMB Z    1    TGG trp W   37

CTT leu L    3    CCT pro P   14    CAT his H    6    CGT arg R    2
CTC leu L   22    CCC pro P   39    CAC his H   21    CGC arg R   34
CTA leu L    6    CCA pro P   10    CAA gln Q   14    CGA arg R    3
CTG leu L   70    CCG pro P   13    CAG gln Q   81    CGG arg R   10

ATT ile I   17    ACT thr T   11    AAT asn N   13    AGT ser S    7
ATC ile I   79    ACC thr T   48    AAC asn N   45    AGC ser S   27
ATA ile I    4    ACA thr T   13    AAA lys K   25    AGA arg R    7
ATG met M   29    ACG thr T   16    AAG lys K   97    AGG arg R   13

GTT val V    5    GCT ala A   15    GAT asp D   19    GGT gly G   10
GTC val V   27    GCC ala A   56    GAC asp D   44    GGC gly G   54
GTA val V    6    GCA ala A   13    GAA glu E   2?    GGA gly G   16
GTG val V   58    GCG ala A   12    GAG glu E   78    GGG gly G   28
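
The codon counts in the Figure 3a table can be cross-checked against the stated total of 1436 codons. The Python sketch below sums the legible counts. The entry for GAA is only partly legible in the source (it reads "2?") and is left out of the sum, and blank entries for TAA and TGA are taken as 0; the remainder then indicates what the GAA entry would have to be if every other count was read correctly. That inference is an assumption and is not a value stated in the figure.

```python
TOTAL_CODONS = 1436  # stated in the figure header

# Counts as read from the Figure 3a table; None marks the partly legible entry.
COUNTS = {
    "TTT": 5,  "TCT": 5,  "TAT": 10, "TGT": 6,
    "TTC": 30, "TCC": 11, "TAC": 29, "TGC": 14,
    "TTA": 2,  "TCA": 4,  "TAA": 0,  "TGA": 0,
    "TTG": 7,  "TCG": 6,  "TAG": 1,  "TGG": 37,
    "CTT": 3,  "CCT": 14, "CAT": 6,  "CGT": 2,
    "CTC": 22, "CCC": 39, "CAC": 21, "CGC": 34,
    "CTA": 6,  "CCA": 10, "CAA": 14, "CGA": 3,
    "CTG": 70, "CCG": 13, "CAG": 81, "CGG": 10,
    "ATT": 17, "ACT": 11, "AAT": 13, "AGT": 7,
    "ATC": 79, "ACC": 48, "AAC": 45, "AGC": 27,
    "ATA": 4,  "ACA": 13, "AAA": 25, "AGA": 7,
    "ATG": 29, "ACG": 16, "AAG": 97, "AGG": 13,
    "GTT": 5,  "GCT": 15, "GAT": 19, "GGT": 10,
    "GTC": 27, "GCC": 56, "GAC": 44, "GGC": 54,
    "GTA": 6,  "GCA": 13, "GAA": None, "GGA": 16,
    "GTG": 58, "GCG": 12, "GAG": 78, "GGG": 28,
}

legible_total = sum(n for n in COUNTS.values() if n is not None)
implied_missing = TOTAL_CODONS - legible_total
print(legible_total, implied_missing)
```

The remainder is consistent with a two-digit GAA count beginning with 2, which gives some confidence that the other entries were read correctly.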



3/59 



wo 01/79518 



PCT/GBOl/01784 



Codon usage in human genes (MH), wild type HIV-1 Gag-pol 
(WT) and the codon optimised HIV-1 Gag-pol (CO) 
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AG 


G 


18 


29 


11 


Glu 


A 


25 


65 


38 






















CG 


A 


6 


6 


0 


GA 


G 


75 


35 


62 


Lys 


A 


18 


58 


28 


Thr 


A 


14 


45 


16 




C 


37 


0 


61 












AA 


6 


82 


42 


72 


AC 


C 


57 


29 


52 




G 


21 


6 


10 


Gly 


A 


14 


53 


21 














G 


IS 


0 


19 




T 


7 


0 


5 


GG 


C 


50 


21 


55 


Phe 


C 


80 


45 


45 




T 


14 


26 


13 














G 


24 


24 


24 


TT 


T 


20 


55 


56 












Asn 


C 


78 


29 


71 




T 


12 


3 


0 












Tyr 


C 


74 


20 


80 


AA 


T 


22 


71 


29 












Pro 


A 


16 


52 


24 


TA 


T 


26 


80 


20 












Hi8 


C 


79 


30 


90 


CC 


C 


48 


15 


39 












Asp 


C 


75 


64 


70 


CA 


T 


21 


70 


10 




G 


17 


3 


21 


Val 


A 


5 


56 


4 


GA 


T 


25 


36 


30 














T 


19 


30 


15 


GT 


C 


25 


6 


20 












Ile
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58 
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G 


64 


24 


76 












AT 


C 


18 


19 


92 
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12 


0 



T 77 23 0 



Figure 3b 
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FIGURE 4 



env-mn [1 to 2571] -> Codon Usage

DNA sequence 2571 b.p. ATGAGAGTGAAG ... GCTTTGCTATAA linear

857 codons

MW : 97078 Dalton CAI(S.c.) : 0.083 CAI(E.c.) : 0.140



TTT phe F  13   TCT ser S   7   TAT tyr Y  15   TGT cys C  16
TTC phe F  11   TCC ser S   3   TAC tyr Y   7   TGC cys C   5
TTA leu L  20   TCA ser S  13   TAA OCH Z   1   TGA OPA Z
TTG leu L  17   TCG ser S   2   TAG AMB Z       TGG trp W  30
CTT leu L   9   CCT pro P   5   CAT his H   8   CGT arg R
CTC leu L  11   CCC pro P   9   CAC his H   6   CGC arg R   2
CTA leu L  12   CCA pro P  12   CAA gln Q  22   CGA arg R   1
CTG leu L  15   CCG pro P   2   CAG gln Q  19   CGG arg R   1
ATT ile I  21   ACT thr T  16   AAT asn N  50   AGT ser S  18
ATC ile I  10   ACC thr T  14   AAC asn N  13   AGC ser S  11
ATA ile I  32   ACA thr T  28   AAA lys K  32   AGA arg R  30
ATG met M  17   ACG thr T   5   AAG lys K  14   AGG arg R  15
GTT val V   8   GCT ala A  16   GAT asp D  18   GGT gly G  10
GTC val V   9   GCC ala A   7   GAC asp D  14   GGC gly G   6
GTA val V  26   GCA ala A  20   GAA glu E  36   GGA gly G  28
GTG val V  12   GCG ala A   5   GAG glu E  10   GGG gly G  12
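
A codon usage table of the kind shown in Figure 4 can be produced by reading a coding sequence in frame and tallying each triplet. The following Python sketch is illustrative only and is not the software used to generate the figure. The function name `codon_counts` is hypothetical, and the example input is merely the first and last twelve bases of env-mn (SEQ. ID. NO. 3) joined together, not the full 2571 b.p. sequence.

```python
from collections import Counter


def codon_counts(dna: str) -> Counter:
    """Tally in-frame codons of a coding sequence, reading from the first base.

    Whitespace (such as the 10-base blocks used in a sequence listing) is ignored.
    """
    seq = "".join(dna.split()).upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return Counter(seq[i:i + 3] for i in range(0, len(seq), 3))


# Assumption: only the two 12-base fragments of env-mn are used here
# (start ATGAGAGTGAAG, end GCTTTGCTATAA); the full sequence would be
# passed in the same way.
counts = codon_counts("ATGAGAGTGAAG" + "GCTTTGCTATAA")
for codon in sorted(counts):
    print(codon, counts[codon])
```

Applied to a complete coding sequence, the values of `counts` should sum to the number of codons stated in the figure header (857 for env-mn).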
FIGURE 5

SYNgp160mn -> Codon Usage

DNA sequence 2571 b.p. ATGAGGGTGAAG ... GCGCTGCTGTAA linear

857 codons

MW : 97078 Dalton CAI(S.c.) : 0.074 CAI(E.c.) : 0.419




TTT phe F       TCT ser S   2   TAT tyr Y   1   TGT cys C
TTC phe F  24   TCC ser S   4   TAC tyr Y  21   TGC cys C  21
TTA leu L       TCA ser S       TAA OCH Z   1   TGA OPA Z
TTG leu L       TCG ser S       TAG AMB Z       TGG trp W  30
CTT leu L       CCT pro P       CAT his H   2   CGT arg R   1
CTC leu L  20   CCC pro P  26   CAC his H  12   CGC arg R  36
CTA leu L   1   CCA pro P       CAA gln Q       CGA arg R
CTG leu L  63   CCG pro P   2   CAG gln Q  41   CGG arg R   4
ATT ile I   2   ACT thr T       AAT asn N   2   AGT ser S
ATC ile I  61   ACC thr T  59   AAC asn N  61   AGC ser S  48
ATA ile I       ACA thr T       AAA lys K   1   AGA arg R   2
ATG met M  17   ACG thr T   4   AAG lys K  45   AGG arg R   6
GTT val V       GCT ala A       GAT asp D   2   GGT gly G   1
GTC val V   1   GCC ala A  40   GAC asp D  30   GGC gly G  47
GTA val V   1   GCA ala A       GAA glu E   3   GGA gly G
GTG val V  53   GCG ala A   8   GAG glu E  43   GGG gly G   8
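
The shift in codon preference between Figures 4 and 5 can be summarised by converting the raw counts into the percentage usage of each synonymous codon for a given amino acid. The Python sketch below is a minimal illustration under stated assumptions: the standard genetic code is assumed, the helper name `percent_usage` is hypothetical, and only the lysine and glutamic acid counts from the two figures are used as input.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ('*' marks a stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def percent_usage(counts):
    """Return, for each codon, its rounded percentage share among the
    synonymous codons (those encoding the same amino acid) present in counts."""
    totals = Counter()
    for codon, n in counts.items():
        totals[CODON_TABLE[codon]] += n
    return {
        codon: round(100 * n / totals[CODON_TABLE[codon]])
        for codon, n in counts.items()
        if totals[CODON_TABLE[codon]]
    }


# Lys and Glu counts as read from Figure 4 (env-mn) and Figure 5 (SYNgp160mn).
wild_type = percent_usage({"AAA": 32, "AAG": 14, "GAA": 36, "GAG": 10})
optimised = percent_usage({"AAA": 1, "AAG": 45, "GAA": 3, "GAG": 43})
print(wild_type)
print(optimised)
```

With these inputs the A-ending codons AAA and GAA fall from roughly 70% and 78% of their amino acid in the wild type sequence to roughly 2% and 7% in the codon optimised sequence.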






FIGURE 6 



HIV Constructs



pSYNgpl 



gag 



pA 



pRV67 



VSV-G 



pA 
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ATGGGAGACCCTTTGACAT<K3AGCAAGGCGCTCAAGAAGTTi^ 
ATGGGCGATCCCCTCACCTGGTCaUWlGCCtntS;^ 



GGGTCTCAGAAATTAACTACTGGTAACTGTAATTGGGC^ 120 
GGTAGCCaAAAGCTTACCAOUSGCAATTGCaACT^ 120 



CATGATACCS^CTTTGTAAAAGAAAAGGACTGG^ ISQ 
avaSACACTAATTTaSTTAAGGAQAAAGATTGGa^CTO^ 180 



GAAGATGTAACTOWSAOSCTGTCaGGACAAGAAAGAGAGGC 240 
GA6GACX3TGACCC3^a^TTCTCTGGaaM3GA^ 240 



GCAATTTCTGCTGTAAAGATGGGCCTCCAGATTAATAATGT^ 300* 
GCCATOUSCGCAGTCAAAATGGGGCTGCaAATCAAC^ 30o 



TTCCAGCTCCTAAGAGOGAAATATGAAAAGAAQAOTSCTAArAAAAA^ 360 

TTTCAACrGCTCaXXOTAAGTAaSAGAA^ 360 

TCTGAAGAATATCCaATCATGAlMATGGGGCTGGAAACA^^ 420 

AGOGAGGAGTACCOATTATGATOSACXSGC^ 42o 

CCTAGAGGATATACTACTTGGGTGAATACO^TAO^^ 480 

CCCAGGGGCTATACaiCCTGGGTCAACACa^TCCAGAa^ 480 



AGTCAAAAOTATTTGGGATATTATC3«3TAGACTGTACT^ 54 q 

TCCCAGAACCTGTTOGGCATCCTGTOTGTGGACTGCACCTCaS^^ 54O 



TTGGATGTGGTACCTGGCCAGGCAGQACAAAAGCAGATATTAC^^ 600 

CraSACGTGGTGCa^GGAOWSGCTGGACAGAAAC^ 600 

ATAGCAGATGATTQGGRTAATJ^CATCCaTTACCGaATGCrcCAC^ 6G0 

ATOGCCGACmCTGGGATAATCGCCACCCCCTGCCAaACGCCXCTCTOGTC 660 

CAftGGGCCTATTCCCATQA(MCAAGGTTTATTAGAGGTTTAG^ 720 

CAGGGGCCTATCCCPATGACXXSCTAGGTTCATTAGGGQACT^^ 720 



CatSATGGAGCCTGCTTTTCaTCaWSTTTAGOai^ 780. 

CftGATGGAGCX3U3CATTT(aCCAATTTA6GCaGAar^ 78o 

^'^'^S^^^^'^^^^'^'^^'^:'^^ 840 

ATOAGCGAGGG(aTTAAaOTC»T<aTCGQAAaGCCCSU«3QCAC^ 840 



GCTAAGGRACCTTACCCftQAATTTOTAGACAGACTATTATCCCSUUiTAAAAAGTGA 900 
GCCAAQaaACXATACCCTGA6TTTGTaaCAGGCTTCTGTCCa«3ATTAA^ 9Q0 



CATCCAOVAGAfiATTTCAAAATTCTTGACTGATACACTGACTATTCAGAAasCAAAT^ 960 ' 
CACCCTCAGGAGATCTCCAAGTTCrrGACAGACACACTGACIATCCaAAATGC^ 960 
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Wr GAATGTAGAAATGCTATGAGACATTTAAGACCAGAGGATACATTAGAAGAGAA^ 1020 

CO GAGTGaU3AAACGCCaTGA(3GCACCTCaGACCTO^^ 1020 

WT GCTTGCAGAGACATTGGAACTACAAAAaiWiAGAT^ 1080 

CO GCATGTCGaSACATTGGOVCrACauwS^^ 1080 

WT ACS^TCTTGCSGQCCCATTTaAAGGl^^ 1140 

CO ACCG6CCTGGCTGGTCCATTCRfiAGGaGGAGCACTGAAGGG3«^ 1140 

WT C3VAACSVTGTTATAACTGT6GGAAGCCAGGACATTTATCTaGTCAATGTAGA(^^ 1200 

CO CAAACATGTTATAATTGTGGGAAGCCAGGACATTTATCTAGTCaAIGTAjGAGC^ 1200 

WT GTCTGTTTTAAATGTA71ACACK:CTGGACATTTCTCAAAGC3UVT^ 1260 

CO GTCTGTTTTAAATGTAAAGAGCCTGGACATTTCTCAAAGCAATGCAC^^ 1260 

WT AACXSGGAAGCAAGGGGCTCAAGGGAGGCCCCAGAAACAAACTTTCCCGATACAACA^ 1320 

CO AAC3GGGAAGCAAGGGGCTCAAGGGAGGCCCCAGAAAC3UACTTTCCX^TACAACA^ 1320 

WT AGTCAGCACAACAAATCTGTTGTACAAGAQACTCCTCAGACTCAAAATCTGTACCCAGA^ 1380 

CO A3TC3\GCACAACAAATCTGTTGTACARGRfiACrCCTCaGACTCAARATCTO 1380

WT CTGAOOGAAATAAAAAAGGAAXACAATOTCAA66AGAAGGATCAAGTAGA6 ' 1440 

CO CTGAGCX3AAATAAAAAAOGAATACAAroTCAA6GA0AA6^ 1440 

WT CTGGACMTTTGTGGGAGTAAavrATAATCTAGAGAAAASGCCTACTACAATi^^ 1500 " 

CO CTGGACMTTTGTGGGAGTAACATACAATCTCQAGaAGAGGCCCACTACCATOG 1500

WT TTAATGATACTCCCTTAAATGTACTGTTAGACACAGGAGCAGATACTTCAGTGTT^ 1560 

CO TCAATGACACCCCTCrrAATGTGCTGCTQGACACaaGAGCCGACACCAGCQTTCTCACT^ 1560

WT CTGCACArrATAATAGGTTAAAATATAGAGGGAGAAAATATCAAGGGa^^ 1620 

CO CTGCTCACiaTAACaGaCTGAAATACAGaGGAAGGAAATACCAGGGCACSU^ 1620 

WT GWSTGGGASCaAARTGTGGAAACATTTTCTAOGCCTGTGACTATA 1680 

CO GOGTTGGAGGCAACGTCGAAACCTTTTOCACrCCTQTCACCATCAl^^ 1680 

WT ACATTAAGACAAGAATGCTAGTGG<aM3ATATTCCAGTGa£^ 1740 

CO ACATTAAAACCAQAATGCTGGTCGCCGACATCCCCfiTGACCATCCTTaSCaGAGACAT^ 1740 

WT TTCAGGACTTAGGTGCAAAATTGQTTTTGGCACAGCTCTCCAAGGAAATAAAATTT^^ 1800 

CO TCCAGGACCTGGGCQCTAAACTCGTGCTGGCACAACTGTCTAAGGAAATCAAGTTCCGC^ 1800

WT AAATAGAGTTAAAAGAGGGCACRATGGQGCCAAAAATTCCTCAATGGCCACTCACTAAGQ i860 

CO AGATCX3AGCTGAAAGAGQQCACAATGGGTCCAAAAATCCCCCAGTGGCCCCTGACCAAAG 1860 

WT AGAAACTABAAGGGGCC3\AAGAGATAGTCCA;AGACTATTGTCAGAGGQAAAAATATCAG 1920 

CO AGAAGCTTGAGGGCGCTAAGGAAATCGTGCAGOGCCTGCTTTCTGAGGGCAAGATTAGCG 1920 

WT AAGCTAGTGACAATAATCCTTATAATTCACCCATATTTGTAATAAAAAAGAGGTCTGGCA 1980 

CO AGGCCAGCGACAATAACCCTTACAACAGCCCCATCTTTGTGATTAAGAAAAGGAGCGGCA 1980
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WT i^TGC3A6GTTATXACaAGATCTGAGAGAATTAAAC7U^ 2040 

CO AAOXSGAGACTCCTGCaGGACCnXSAGGGT^CT 2040 

WT TATCCAGAGGATTGCCTCACCCXSGGAGGATTiU^TTAAATGTAAACAa 2100 

CO TCTCTCGCGGACTGCCTCACCCa^GOGGCCTGATTAAATGCAAGGAC^ -2100 

WT ATATTGGAGATGCATATTTCACTATACCCTTAaATCCAGAGTTTAGACCATATACA^^ 2160 

CO AGATTGGAGAOGCTTATTTTACCATCCCCCTCGATCCTGAATTTCGCCCCTATACTGCOT 2160

WT TCACTATTCCCTCCATTAATCATCAAGJ^CCAGATAAAAGATATGTGTGGAAATGTTTAC 2220 

CO TTACCATCCCCAGCATCAATCACCAGGAGCCCGATAAACGCTATGTGTGGAAGTGCCTCC 2220 

WT CACAAGGATTCGTGTTGAGCCCATATATATATCAGAAAAGATTACAGGSiAAT^^ 2280 

CO CCCAGGGATTTGTGCrrAGCCCCTACATTTACCAGAAGACACOT 2280 

WT CTTT TAGGGAAaGAT^TCCTQRAGTACftATTGTATC^ 2340 

CO CTTTCCGaSAAAQATACCCAGASGTTCAACrCTACGAATATATGQACGAC^ 2340 

WT GAAGTAATGGTTCTAARAAACAACACAAAGAGTTAATCATAGAATTAAGGGCGATCTTAC 2400 

CO GGTCCAACGGGTCTAAQAAGCASCACAAGGAACTCATCATCGAACTQAGGGC^ 2400 

WT TGGAAAAGGGTTTTSAQAaiCCaGATGATAAATTACAA^^ 2460 

CO TGGAGAAAGGCrrCGAGACACCCGACaACAAGCTGaUVGAAGTTCCT^ 2460 

WT XAGGTTATCAACTTTGTCCTaAAAATTGGAAAGTACAAaAaATGCaAra^^ 2520 ' 

CO TOGOCTACaGCTTT6CCCTGAAAACTGGAAAGTCCAGAAQATGCMTT<^^ 2520 

WT AGAATCCAACCCTTAATGATGTGCAAfiAATTAATGGGGAATATAACATGGATGAGCTCA^ 2580 

CO AGAACCCAACACTGAACGACGTCCAGAAGCTCATGGGCAATATTACCTGGATGAGCTCCG 2580 

WT GGATCCCAGGGTTGACAGTAAAACACATTGCAGCTACTACTAAGGGATGTTTAGAGTTG^ 2640 

CO GAATCCCTGGGCTTACCGTTAAGCACATTGCCGCAACTACAAAAGG^ 2640 

WT ATCAAAAAGTAATTTGGACGGAAGAGGCACAAAAAGAGTTAC3AAaAAAATAATG 2700 

CO ACCAGTW^GGTCATTTGGACAGAGGAAGCTCAGAAGGAACTGGAGGAGAATAATGAAAAGA 2700

WT TTAAAAATGCTCAAGGGTTACAATXTTATAATCCAGAAGAAGAAATGTTATGTGAG^^ 2760 

CO WAAOAATOCTCAAGGGCTCCAATACTACAATCCCaAAGAAGAAATGTTGTGaSMG^^ 2760 

WT AAATTACAAAAAATTATGAGGCAACTTATGTTATAAAACAATCACAAGGAATCCTATGGG 2820 

CO AAATCACTAAGAACTACGAAGCCACCTATGTCATCT^CAGTCCCAASGCATCTTQTGGG 2820 

WT CAGGTAAAAAGATTATGAAGGCTAATAAGGGATGGTCAACAGTAAAAAATTTAATGTTAT 2880 

CO CC6GAAAQAAAATCATGAAGGCCAACAAAGGCTGGTCCACCGTTAAAAATCTGATGCTCC 2880 

WT TGTTGCAACATGTGGCAACACSAAASTATTACTAGAGTAGGAAAATGTCCAAC^ 2940 

CO TCCTCCAGCACGTCGCCACCQAGTCTATCACCCGaSTCGGCAAGTGCCCCACCTTCaU^ 2940 

WT TACCATTTACCAAAGAGCAAGTAATGTGGGAAATGCAAAAAGGATGGTATTATTCTTGGC 3000 

CO TTCCCTTCACTAAGGAGCAGGTGATGTGGGAGATGCAAAAAGGCTGGTACTACTCTTGGC 3000
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WT TCCCAGAAATAGTATATACACATCaAGTMTTCATGAT^ 3060 

CO TTCCCXSAGATCGTCTACACCaVCaUVBTGGTGaVCGACmCTGGA^ 3060 

WT AAGAACCTACATO^AATAACAATATACACTGATGGGGGAAAAC^^ 3120 

CO AGGAGCCCACTAGCGGAATTACAATCTATACCGACGGCGGAAAGCAAT^ 3120 

WT TAGCAGCTTATGTGACCAGTAATGGGAGAACTAAACAGAAAAGGT^ 3180 

CO TCGCTGCATACGTCACATCTAACGGCCGCACCAAGCATU^AGA^ 3180 

WT ATCAAGTTGCTGAAAGAATGGCAATACAAATGGCATTAGAGGATACCAGAGATAAAC^ 3240 

CO ACOUSGTGGCTGAGAGGATGGCTATCCAGATGGCCCTTGAGGACACTAGASACAAGC^ 3240 

WT TAAATATAGTAACTGATAGTTATTATTGTTGGAAAAATATTACAGAAGGATTAGGTTT^ 3300 

CO TGAACATTGTGACTGACAGCTACTACrGCTGGAAAAACATCACAGAGGGCCTTG^ 3300 

WT AAGGACGACAAAGTCCTTGGTOGCCTATAATACAAT^TATACGaGAAAAAGa^ 3360 

CO AGGGACCCCAGTCTCCCTGGTGGCCTATCATCCAGAATATCCGCGAAAAGGAAATTGTC^ 3360 

WT ATTTTGCTTGGGTACCTGGTCACAAAGGGATATATGGTAATC^ 3420 

CO ATTrrCGCCTGGQTGCCTGGACACAAAGGAATTTACGGCAACCAACTCGCGGAT^^ 3420 

WT CAAAAATAAAAGAAGAAATCATGCTAGCATACCAAGGCACA^ * 3480 

CO CCAAAATTAAAGJVSGAAATCATGCTTGCCTACCAGOGCACACAaATTAAGGS^^ 3480 

WT ATGAAGATGCAGGGTTTQACTTAteTGTTCCTTATGACATCATGATACCTGTATCT^^ 3540 ■ 

CO AOaAQGACGCTGGCTTTGACCTGTGTCTGCCATACGACATCAT^^ 3540 

WT CAAAAATCATACCCACAGATGTAAAAATTCAAGTTCCTCCTAATAGCTTTGGATGGGTC^ 3600 

CO CAAAGATCATTCCAACCGATQTCAAGATCCAGGTGCCACCCAATTCATTTro 3600 

WT CTGGGAAATCATCAATGGCAAAACAGGGGTTATTAATTAATGGAGGAATAATTGATGAAG 3660 

CO CCGGAAAGTCCAGG^iTGGCTAAGa^GTCTTCrGATTAACG^ 3660 

WT ■ GATATACAGGAGAAATACAAGTGATATGTACTAATATTGGAAAAAGTAATATD^TTAA 3720 

CO GATACACCGGCGAAATCCAGGTGATCTGCACAAATATaSGCAAAAGCAATATT^ 3720 

WT TAGAGGGACAAAAATTTGCACAATTAATTATT^CTACAGCATCACT 3780 

CO TOSAAGGGCAGaaQTTOGCrCRACTCATCATCCrCCaGCACCACA^ 3780 

WT CTTGGQATGRAAATAAAATATCTCAGAGAGGGQATAAaGGATTTGGAAGTAm^ 3840 

CO CTTGGGACQtfU^AACAA6ATTAGCC3«?AGAGGTGACAAGGGCTTCG^ 3840 

WT TCTGGGTAGAAAATATTCAGGAAGCACAAGATGAACATGAGAATTGGCATACATCACCAA 3900 

CO TCTGGGTGGAGAACATCCAGGAAGCACAGGACGAGCAa3AGAATTGGCACAC^ 3900 

WT AGATATTGGCAAGAAATTATAAGATACCATTGACTGTAGCAAAACAGATAACTCAAGAAT 3960 

CO i^TTTTGGCCCGCAATTACAAGATCCCACTGACTGTGGCTAAGCAGATCACAa 3960 

WT GTCCTCATTGCACTAAGCAABGATCAGGACCTGCAGGTTGTGTCATGMATCTCCT^^ 4020 

CO GCCCCCACTGCACCAAACJU«3GTTCTGGCCCa3Ca3GCTGCGTGATGAGGTCCCCCA^ 4020 
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WT ATTGGa^CAGATTGCACACATTTGGACAATA^^ 4080 

CO ACTGGC3W3GC3U5ATTGCACCCACCTa3AC^ 4080 

WT ATTCAGGATACATACATGCTACATTATTGTa^7^GAAAATGC3VTTATG 4140 

CO ATTCCGGCTACATCCACGCAACACTCCTCTCCAAGGAAAATGCATTGTGCACCTCCC^ 4140 

WT CTATTTTAGAATGGGCAAGATTGTTTTCACCAAAGTCCTTACACACAGATAACGGa^ 4200 

CO CAATTCTGGAATGGGCCMGCTGTTCTCTCCAAAATC^^ 4200 

WT ATTTTGTGGCAfiAACCAGTTGTAAATTTGTTGAAGTTCCTAAAGATAGCACATACCAa 4260 

CO ACTTTGTGGCTGAACCTGTGGT6AATCTGCTGAAGTTCCTGAAAATCGCCCACACCACTG 4260 

WT GAATACCATATCATCCAGAAAGTCAGGGTATTGTAGAAAGGGCAAATAGGACCTTGA^ 4320 

CO GCATTCCCTATCACCCTGAAAGCCAGGGCATTGTCGAaAGGGCCAACAGAACTCTGA^ 4320 

WT AfiAAGATTC3UUVGTCATAfiW3ACAACACTCAAACACTGGAGGCAGC^ 4380 

CO AAAAGATCCAATCTCaCAGAGACaATACACAGACATTGGAGGCCGCACTTC^ 4380

WT TCATTACTTGTAACAWVSGQRGGGAAAGTATGGGAGGACAGACACCATGG^^ 4440 

CO TTATCACCTGCAACAAAGGAAGAGAAAGCATGGGOGGCCAGACCGCCTGG^ 4440 


WT TCACTAATCAAGCACAAGTAAaa^GATGAGAAACTTTTACTACAGCAAGCACA^rCCTC 4500 

CO TCACTAACCAGGCCCAGGTCATCCATGAAAAOCTGCTCTTGCAQCAGGCCCAOT 4500 

WT AAAAATTTrGTTTTTACAAAATCCCTGGTGAACATGATTGGAAGGQACCTAC^ 4560 * 

CO AAZ^TTCTGCTTTTATAAOATCCCCGOTGAGGACGACI^^ 4560 

WT T0TGQAA6GGTGAT66TGCA6TA6TA0TTAATGATGAAGQAAA 4620 

CO TGTGGAAAGQAOAOGOOKUVQTTGTGGTGAACGATGIUS^^ 4620

WT CATTAACCAGGACTAAGTTACTAATAAAACCAAATTGA ' 4658 

CO CCCTGACACGCACCAAGCTTCTCATCAAGCCAAACTGA 4658 
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FIGURE 39 
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FIGURE 40 
SEQUENCE LISTING PART OF THE DESCRIPTION



SEQ. ID. NO. 1 - Wild type gagpol sequence for strain HXB2 (accession no. K03455)

ATGGGTGCGA GAGCGTCAGT ATTAAGCGGG GGAGAATTAG ATCGATGGGA AAAAATTCGG 60
TTAAGGCCAG GGGGAAAGAA AAAATATAAA TTAAAACATA TAGTATGGGC AAGCAGGGAG 120
CTAGAACGAT TCGCAGTTAA TCCTGGCCTG TTAGAAACAT CAGAAGGCTG TAGACAAATA 180
CTGGGACAGC TACAACCATC CCTTCAGACA GGATCAGAAG AACTTAGATC ATTATATAAT 240
ACAGTAGCAA CCCTCTATTG TGTGCATCAA AGGATAGAGA TAAAAGACAC CAAGGAAGCT 300
TTAGACAAGA TAGAGGAAGA GCAAAACAAA AGTAAGAAAA AAGCACAGCA AGCAGCAGCT 360
GACACAGGAC ACAGCAATCA GGTCAGCCAA AATTACCCTA TAGTGCAGAA CATCCAGGGG 420
CAAATGGTAC ATCAGGCCAT ATCACCTAGA ACTTTAAATG CATGGGTAAA AGTAGTAGAA 480
GAGAAGGCTT TCAGCCCAGA AGTGATACCC ATGTTTTCAG CATTATCAGA AGGAGCCACC 540
CCACAAGATT TAAACACCAT GCTAAACACA GTGGGGGGAC ATCAAGCAGC CATGCAAATG 600
TTAAAAGAGA CCATCAATGA GGAAGCTGCA GAATGGGATA GAGTGCATCC AGTGCATGCA 660
GGGCCTATTG CACCAGGCCA GATGAGAGAA CCAAGGGGAA GTGACATAGC AGGAACTACT 720
AGTACCCTTC AGGAACAAAT AGGATGGATG ACAAATAATC CACCTATCCC AGTAGGAGAA 780
ATTTATAAAA GATGGATAAT CCTGGGATTA AATAAAATAG TAAGAATGTA TAGCCCTACC 840
AGCATTCTGG ACATAAGACA AGGACCAAAG GAACCCTTTA GAGACTATGT AGACCGGTTC 900
TATAAAACTC TAAGAGCCGA GCAAGCTTCA CAGGAGGTAA AAAATTGGAT GACAGAAACC 960
TTGTTGGTCC AAAATGCGAA CCCAGATTGT AAGACTATTT TAAAAGCATT GGGACCAGCG 1020
GCTACACTAG AAGAAATGAT GACAGCATGT CAGGGAGTAG GAGGACCCGG CCATAAGGCA 1080
AGAGTTTTGG CTGAAGCAAT GAGCCAAGTA ACAAATTCAG CTACCATAAT GATGCAGAGA 1140
GGCAATTTTA GGAACCAAAG AAAGATTGTT AAGTGTTTCA ATTGTGGCAA AGAAGGGCAC 1200
ACAGCCAGAA ATTGCAGGGC CCCTAGGAAA AAGGGCTGTT GGAAATGTGG AAAGGAAGGA 1260
CACCAAATGA AAGATTGTAC TGAGAGACAG GCTAATTTTT TAGGGAAGAT CTGGCCTTCC 1320
TACAAGGGAA GGCCAGGGAA TTTTCTTCAG AGCAGACCAG AGCCAACAGC CCCACCAGAA 1380
GAGAGCTTCA GGTCTGGGGT AGAGACAACA ACTCCCCCTC AGAAGCAGGA GCCGATAGAC 1440
AAGGAACTGT ATCCTTTAAC TTCCCTCAGG TCACTCTTTG GCAACGACCC CTCGTCACAA 1500
TAAAGATAGG GGGGCAACTA AAGGAAGCTC TATTAGATAC AGGAGCAGAT GATACAGTAT 1560
TAGAAGAAAT GAGTTTGCCA GGAAGATGGA AACCAAAAAT GATAGGGGGA ATTGGAGGTT 1620
TTATCAAAGT AAGACAGTAT GATCAGATAC TCATAGAAAT CTGTGGACAT AAAGCTATAG 1680
GTACAGTATT AGTAGGACCT ACACCTGTCA ACATAATTGG AAGAAATCTG TTGACTCAGA 1740
TTGGTTGCAC TTTAAATTTT CCCATTAGCC CTATTGAGAC TGTACCAGTA AAATTAAAGC 1800
CAGGAATGGA TGGCCCAAAA GTTAAACAAT GGCCATTGAC AGAAGAAAAA ATAAAAGCAT 1860
TAGTAGAAAT TTGTACAGAG ATGGAAAAGG AAGGGAAAAT TTCAAAAATT GGGCCTGAAA 1920
ATCCATACAA TACTCCAGTA TTTGCCATAA AGAAAAAAGA CAGTACTAAA TGGAGAAAAT 1980
TAGTAGATTT CAGAGAACTT AATAAGAGAA CTCAAGACTT CTGGGAAGTT CAATTAGGAA 2040
TACCACATCC CGCAGGGTTA AAAAAGAAAA AATCAGTAAC AGTACTGGAT GTGGGTGATG 2100
CATATTTTTC AGTTCCCTTA GATGAAGACT TCAGGAAGTA TACTGCATTT ACCATACCTA 2160
GTATAAACAA TGAGACACCA GGGATTAGAT ATCAGTACAA TGTGCTTCCA CAGGGATGGA 2220
AAGGATCACC AGCAATATTC CAAAGTAGCA TGACAAAAAT CTTAGAGCCT TTTAGAAAAC 2280
AAAATCCAGA CATAGTTATC TATCAATACA TGGATGATTT GTATGTAGGA TCTGACTTAG 2340
AAATAGGGCA GCATAGAACA AAAATAGAGG AGCTGAGACA ACATCTGTTG AGGTGGGGAC 2400
TTACCACACC AGACAAAAAA CATCAGAAAG AACCTCCATT CCTTTGGATG GGTTATGAAC 2460
TCCATCCTGA TAAATGGACA GTACAGCCTA TAGTGCTGCC AGAAAAAGAC AGCTGGACTG 2520
TCAATGACAT ACAGAAGTTA GTGGGGAAAT TGAATTGGGC AAGTCAGATT TACCCAGGGA 2580






TTAAAGTAAG GCAATTATGT AAACTCCTTA GAGGAACCAA AGCACTAACA GAAGTAATAC 2640
CACTAACAGA AGAAGCAGAG CTAGAACTGG CAGAAAACAG AGAGATTCTA AAAGAACCAG 2700
TACATGGAGT GTATTATGAC CCATCAAAAG ACTTAATAGC AGAAATACAG AAGCAGGGGC 2760
AAGGCCAATG GACATATCAA ATTTATCAAG AGCCATTTAA AAATCTGAAA ACAGGAAAAT 2820
ATGCAAGAAT GAGGGGTGCC CACACTAATG ATGTAAAACA ATTAACAGAG GCAGTGCAAA 2880
AAATAACCAC AGAAAGCATA GTAATATGGG GAAAGACTCC TAAATTTAAA CTGCCCATAC 2940
AAAAGGAAAC ATGGGAAACA TGGTGGACAG AGTATTGGCA AGCCACCTGG ATTCCTGAGT 3000
GGGAGTTTGT TAATACCCCT CCCTTAGTGA AATTATGGTA CCAGTTAGAG AAAGAACCCA 3060
TAGTAGGAGC AGAAAGCTTC TATGTAGATG GGGCAGCTAA CAGGGAGACT AAATTAGGAA 3120
AAGCAGGATA TGTTACTAAT AGAGGAAGAC AAAAAGTTGT CACCCTAACT GACACAACAA 3180
ATCAGAAGAC TGAGTTACAA GCAATTTATC TAGCTTTGCA GGATTCGGGA TTAGAAGTAA 3240
ACATAGTAAC AGACTCACAA TATGCATTAG GAATCATTCA AGGACAACCA GATCAAAGTG 3300
AATCAGAGTT AGTCAATCAA ATAATAGAGC AGTTAATAAA AAAGGAAAAG GTCTATCTGG 3360
CATGGGTACC AGCACACAAA GGAATTGGAG GAAATGAACA AGTAGATAAA TTAGTCAGTG 3420
CTGGAATCAG GAAAGTACTA TTTTTAGATG GAATAGATAA GGCCCAAGAT GAACATGAGA 3480
AATATCACAG TAATTGGAGA GCAATGGCTA GTGATTTTAA CCTGCCACCT GTAGTAGCAA 3540
AAGAAATAGT AGCCAGCTGT GATAAATGTC AGCTAAAAGG AGAAGCCATG CATGGACAAG 3600
TAGACTGTAG TGCAGGAATA TGGCAACTAG ATTGTACACA TTTAGAAGGA AAAGTTATCC 3660
TGGTAGCAGT TCATGTAGCC AGTGGATATA TAGAAGCAGA AGTTATTCCA GCAGAAACAG 3720
GGCAGGAAAC AGCATATTTT CTTTTAAAAT TAGCAGGAAG ATGGCCAGTA AAAACAATAC 3780
ATACTGACAA TGGCAGCAAT TTCACCGGTG CTACGGTTAG GGCCGCCTGT TGGTGGGCGG 3840
GAATCAAGCA GGAATTTGGA ATTCCCTACA ATCCCCAAAG TCAAGGAGTA GTAGAATCTA 3900
TGAATAAAGA ATTAAAGAAA ATTATAGGAC AGGTAAGAGA TCAGGCTGAA CATCTTAAGA 3960
CAGCAGTACA AATGGCAGTA TTCATCCACA ATTTTAAAAG AAAAGGGGGG ATTGGGGGGT 4020
ACAGTGCAGG GGAAAGAATA GTAGACATAA TAGCAACAGA CATACAAACT AAAGAATTAC 4080
AAAAACAAAT TACAAAAATT CAAAATTTTC GGGTTTATTA CAGGGACAGC AGAAATTCAC 4140
TTTGGAAAGG ACCAGCAAAG CTCCTCTGGA AAGGTGAAGG GGCAGTAGTA ATACAAGATA 4200
ATAGTGACAT AAAAGTAGTG CCAAGAAGAA AAGCAAAGAT CATTAGGGAT TATGGAAAAC 4260
AGATGGCAGG TGATGATTGT GTGGCAAGTA GACAGGATGA GGATTAG 4307
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TOWlTATTGGCCATTAGCCATATTATraiTTGGTTATATAGCTl^ 

TTGGCCATTGCATACGTTGTATCTATATCATAATATGTACATTTATATTGGCTCATGTCC 

AATATGACCGCCATGTTGGOITTGATTATTGACTAGTTATTAATAGTAATCAATTA 

GTCATTAGTTCyVTAGCCCATATATGGAGTTCCGCGTTACATAACTTACGGTAAATC^ 

GCCTGGCTGACCGCCCAACGACCCCCGCCCATTGACGTCAATAATGACGTATGTTCCCAT 

AGTAACGCCAATMGGACTTTCCATTQACGTa^ATGGGTGGAGTATOT 

CCACTTGGCAGTACATCAAGTGTATCATATGCCAAGTCCGCCCCCTATTGACGTCAATGA 

CGGTAAATGGCCCGCCTGGCy^TTATGCCCAGTACATGACCTTACGGGACTTTCCTACI^ 

GCAGTACATCTACGTATTAGTCT^TCGCTATTACCy^TGGTGATGCGGTTTTGGCAGT^ 

CAATGGGCGTGGATAGCGGTTTGACTCACGGGGATTTCCAAGTCTCC7VCTC 

aUVTGGGAGTTTGTTTTjKO^CCSU^AATC^ 

CGATCGCCCGCCCCGTTGACGCAAATGGGCGGTAGGCGTGTACGGTGGGAGGTCTATATA 

AGCAGAGCTCGTTTAGTGAACCGTCAGATCACTAGAAGCTTTATTGCGGTAGT^ 

AGTTAAATTGCTAACGCAGTCyiGTGCTTCTGACACAACaGTCTCGAACTTA^^ 

GACTCTCTTAAGGTAGCCTTGCAGAAGTTGGTCGTGAGGCACTGGGCAGGTAAGTATC^ 

GGTTAOVAGACAGGTTTAAGGAQACCSUITAGAAACTGGGCTTG 

CmXSCGTTTCTGATAGGCACCTATTGGTCTTACTGACATCCACTTTQCCT^ 

AGGTGTCCyiCTCCCAGTTCAATTACAGCTCTTAAGGCTAGAGTACTTAATACGACTCACT 

ATAGGCTAGCCTCGAGAATTCGCCACCATGGGCGCCCQCGCCAGCQTGCTGTC6GGCGGC 

GAGCTGGACCGCTGGGAGAAGATCCGCCTGCGCCCCGGCGGCAAAAAGAAGTACy^GCTC 

AAGCACATCGTGTGGGCOUSCCGCQAACTGGAGCGCTTCGCCGTGAACCCC^ 

GAGACCAGCGAGGGGTGCCGCCAGATCCTCGGCCAACTGCAGCCCAGCCTGCAAACC^ 

AGCGAGGAGCTGCGCAGCCTGTACAACACCGTGGCCACGCTGTACTGCGTCOICCA^ . 

ATCGAAATCAAGGATACGAAAGAGGCCCrGGATAAAATCGAAGA(3QAACAGAATAAG2i^ 

AAAAAGAAGGCCCAACAGGCraCCGCGGAOlCCGGAa^CAGCaACiaGG^ 

TACCCOlTCGTGCaGAACATCXaiGGGGCAGATGGTOCACC^^ 

CTGAACQCCTGGGTGAAGGTGGTGGAAGAGAAGGCTTTTAGCCCGGAGGTGATACCCATG 

TTCTCAGCCCTGTCAGAGGGAGCCACCCCCOU^GATCTGAACACCATGCTC^ 

GGGGGACACCAGGCCGCaVTGCS^GATGCTGAAGGAGACCATCW^TGAGGAGGCTC^ 

TGGGATCGTGTGCATCCGGTGCaiCGCAGGGCCCATCGCACCGGGCCakGATGCQT^ 

CGGGGCTCAGACATCGCCGGAACGACTAGTACCCOTC^^ 

AACSUlCCCACCaVTCCCGGTGGGAGAAATCTACAAACGCTGGATC^^ 

AAGATCGTGCGCATGTATAGCCCTACCAGOITCCTGGACATCCGCCAAGGCCCGAAG^ 

CCCTTTCGCGACTACGTGGACCGGTTCTACAAAACGCTCCGCGCCGAGCAGGCTAGCCAG 

6AGGTGAAGAACTGGATGACCGAAACCCTGCTGGTCCAGAACGCGAACCCGGACTGCAAG 

ACGATCCTGAAGGCCCTGGGCCCAGCGGCTACCCTAGAGGAAATGATGACCGCCTGTCAG 

GGAGTGGGCGGACCCGGCCACAAGGCACGCGTCCTGGCT6AGGCCATGAGCCAGGTGACC 

AACTCaSCTACOVTCATGATGOlGCGCGGCAACTTTCGGAACCAACGC^ 

TGCTTCAACTGTGGCAAAGAAGGGCACACAGCCCGCAACTGCAGGGCCCCTAGGA^ 

GGCTGTTGGAAATGTGGAAAGGAAGGACACCAAATGAAAGATTGTACTGAGAGACAGGCT 

^TTTTTTAGGGAAGATCTGGCCTTCCCACAAGGGAAGGCCAGGGAATTTT 

AGACO^GAGCCAACAGCCCCACCAGAAGASAGCTTCAGGTTTGGGGAAGAGAC^ 

CcCTCTCAGAAGC»VGGAGCCGATAGAaU«3GAACTGTATCCTTTAGCTTCCCTC^ 

CTCTTTG6CAGCGACCCCTCGTCACAATAAAGATAGGGGGGCAGCTCAAGGAGGCTCTCC 

TGGACACCGGAGCAGACGACACCGTGCTGGAGGAGATGTCGTTGCCAGGCCGCTGGAAGC 

CGAAGATGATCGGGGGAATCGGCGGTTTO^TCaAGGTGCGCCAGTATGACCAGATCCTC^ 

TCGAAATCTGCGGCO^OUVGGCTATCGGTACdGTGCTGGTGGGCCCCAC^^ 

TCATCQGACGCa^CCTGTTGACGCAGATCGGTTGCACGCTGAACTTCCCCATTAGCCCTA 

TCGAGACGGTACCGGTGAAGCTGAAGCCCGGGATGGACGGCCCGAAGGTCAAGCAATGGC 

C^VTTGACT^GAGGAGAAGATCAAGGCACTGGTGGAGATTTGCACAGAGATGGAAAAGG 

GGAAAATCTCCAAGATTGGGCCTGAGAACCCGTACAAOVCGCCGGTGTTCGCaVAT^ 

AGAAGGACTCGACGAAATGGCGCAAGCTGGTGGACTTCCGCGAGCTGAACAAGCGCACGC 

AAGACTTCTGGGAGGTTCAGCTGGGCATCCCGCACCCCGCAGGGCTGAAGAAGAAGAAAT 

CCGTGACCGTACTGGATGTGGGTGATGCCTACTTCTCCGTTCCCCTGGACGAAGACTTCA 

GGAAGTACACTGCCTTCACAATCCCTTCGATCAACAACGAGACACCGGGGAOT^ 

AGTACAACGTGCTGCCCOUSGGCTGGAAAGGCTCTCCCGC^TCOT 

CCAAAATCCTGGAGCCTTTCCGCAAACAGAACCCCGACATCGTCATCTATCAGTAC^^ 

ATGACTTGTACGTGGGCTCTGATCTAGAGATAGGGCAGCACCGCACCAA6ATCGAGGAGC 

TGCGCCAGCACCTGTTGAGGTGGGGACTGACCACACCCGACAAGAAGCACCAGAAGGAG^ 
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CTCCCTTCCTCTGGATGGGTTACGAGCTGCJVCCCTGACAAATGGACCGTGCAGCCTATCG 
TGCTGCCAGAGAAAGACAGCTGGACTGTCAACGACATACAGAAGCTGGTGGGGAAGTTG^ 
ACTGGGCCAGTCAGATTTACCCAGGGATTAAGGTGAGGCAGCTGTGCAAACTCCTCCGCG 
GAACCAAGGCACTCACAGAGGTGATCCCCCTAACCGAGGAGGCCGAGCTCGAACTGGCAG 
AAAACCGAGAGATCCTAAAGGAGCCCGTGCACGGCGTGTACTATGACCCCTCCAAGGACC 
TGATCGCCGAGATCCAGAAGCAGGGGCAAGGCCAGTGGACCTATCAGATTTACCAGGAGC 
CCTTCAAGAACCTGAAGACCGGCAAGTACGCCCGGATGAGGGGTGCCCACACTAACGA^ 
•TCAAGCAGCTGACCGAGGCCGTGCAGAAGATCACCACCGAAAGCATCGTGATCTGGGGAA 
AGACPCCTAAGTTCAAGCTGCCCATCCaVGAAGGAAACCTGGGAAACCTGGTGGACAGAGT 
ATTGGCAGGCCACCTGGATTCCTGAGTGGGAGTTCGTCaUVCACCCCTCCCCTC 
TGTGGTACCAGCTGGAGAAGGAGCCCATAGTGGGCGCCGAAACCTTCTACGTGGATGGGG 
CCGCTAACAGGGAGACTAAGCTGGGOUAGCCGGATACGTCACTAACCGGGGC^ 
AGGTTGTCACCCTCACTGACACCACCAACCAGAAGACTGAGCTGCAGGCCATTTAOT^ 
CTTTGCAGGACTCGGGCCTGGAGGTGAACATCGTGACAGACTCTCAGTATTC^ 
TCATTCaAGCCCAGCCAGACCAGAGTQAGTCCGAGCTGGTCAA^ 

TGATCAAGAAGGAAAAGGTCTATCTGGCCTGGGTACCCGCCCACAAAGGCATTGGCGGCA 

ATGAGCAGGTCGACAAGCTGGTCTCGGCTGGCATOUSGAAGGTGCTATTCCT 

TCGACAAGGCCCAGGACGAGCACGAGAAATACaVCAGGAACTGQCGGGCC^^ 

ACTTCAACCTGCCCCCTGTGGTGGCO^GAGATCGTGGCCAGCTGT^ 

Ta^GGGCGAAGCCATGCaTGGCCAGGTGGACTGTAGCCCCGGOlTCTGG^ 

GCACCCATCTGGAGGGCAAGGTTATCCTGGTAGCCGTCCATGTGGCCAGTGGCTACATC^ 

AGGCCGAGGTCATTCCCGCC!GAAACAGGGCAGGAGACAGCCTACTTCCTCCTGAAGCTGG 

CAGGCCGGTGGCCAGTGAAGACCATCCATACTGACAATGGCAGCaUl^ 

CGGTTAAGGCCGCCTGCTGGTGGGCGGGAATCAAGaWSGAGTTCGGGATCCre 

CCCaVGAGTCAGGGCGTCaTCGAGTCTATGAATAAGGAGTTAAAGAAQATTATCGGCra: 

TO^AGATCAGGCTGAGCATCTCAAGACCGCGGTCCAAATGGCGGTATTCATCCACAOT 

TCAAGCGGAAGGGGGGGATTGGGGGGTACAGTGCGGGGGAGCGGATCGTGGACATCATC^ 

CGACCGACATCCAGACTAAGGAGCTGCAAAAGCAGATTACCAAGATTCAGAA 

TCTACTACAGGGACAGO^SAAATCCCCTCTQGAAAGGCCCAGCGAAGCT^ 

OTGAGGGGGCAGTAGTOATCa^GGATAATAQCGAaiTCUVAGGTGG 

CIGAAGATCMTAGGGATTATGGCAAACa^GATGGCGGGTGATGATTGC 

AGQATGAGGATTAGGAATTGGGCTAGAGCGGCCGCTTCCCTTTAGTGAGGGTTAATGC^ 

CGAGCAGACATGATAAGATAO^TTGATGAGTTTGGACAAACC^ 

AAAAAATGCTTTATTTGTGAAATTTGTGATGCTATTGCTTTATTTGTAACCATT^^ 

TGCAATAAACAAGTTAAGAAO^CAATTGa^TTCMT^ 

ATGTGGGAGGTTOTTTAAAGCAAGTAAAACCTCTACAAATGTGGTAAAATCCGAT^^ 

TCGATCCQGGCTGGCGTAATAGCGAAGAGGCCCGO^CCGATCGCCCTTCCCAAaVGTTGC 

GCAGCCTGAATGGCGAATGGACGCGCCCTGTAGCGGCGCATTAAGCGCGGCGGGTGTGGT 

GGTTACGCGCAGCGTGACCGCTACACTTGCOIGCGCCCTAGCGCCCGCTCCTTTCGC:^ 

CTTCCCTTCCTTTCTCGCCACGTTCGCCGGCTTTCCCCGTCAAGCTCTAAATCGGGGGCT 

CCCTTTAGGGTTCCGATTTAGAGCTTTACGGOlCCrCGACCGCTUU^^ 

TGATGGTTOICGTAGTGGGCCATCQCCCTGATAGACGGTTTTTCGCCCTOT 

GTCOVCGTTCTTTAATAGTGGACTCTTGTTCCAAACTGGAACAACACTCAACCCTATCTC 

GGTCTATTCTTTTGATTTATAAGGGATTTTGCCGATTTCGGCCTATTGGTTAAAAAATGA 

GCTGATTTAACAAATATTTAACGCGAATTTTAACAAAATATTAACGTTTACAATTO 

TGATGCGGTATTTTCTCCTTACGCaiTCTGTGCGGTATTTCACACCGCATACGCGGATCTG 

CGCAGOlCa^TGGCCTGAAATAACCTCTGAAAGAGGAACTTGGTTAGGTACCTTCTGAGG 

CGGAAAGAACCAGCTGTGGAATGTGTGTCAGTTAGGGTGTGGAAAGTCCCCAGGCTCCCC 

AGCAGGCAGAAGTATGCAAAGCATGCATCTCAATTAGTCAGCAACCAGGTGTGG 

CCCAGGCTCCCGAGCAGGCAGAAGTATGCAAAGCATGCATCTOUVTTAGTC^^ 

AGTCCCGCCCCTAACTCCGCCCATCCCGCCCCTAACTCCGCCCAGTTCCGCCCATTCTCC 

GCCCCATGGCTGACTAATTTTTTTTATTTATGCAGAGGCCGAGGCCGCCTCGGCCTC 

GCTATTCCAGAAGTAGTGAGGAGGCTTTTTTGGAGGCCTAGGCTTTTGCAAAAAGCTT^ 

TTCTTCTGACTICTUVCAGTCTCGAACTTAAGGCTAGAGCCACCATGATTGAAC^ 

TTGCACGCAGGTTCTCCGGCCGCTTGGGTGGAGAGGCTATTCGGCTATGACTGGGCACAA 

CAGACAATCGGCTGCTCTGATGCCGCCGTGTTCCGGCTGTCAGCGCAGGGGCGCCCGGTT 

CTTTTTGTCAAGACCGACCTGTCCGGTGCCCTGAATGAACTGCAGGACGAGGCAGCGCGG 

CTATCGTGGCTGGCCAC6ACGGGCGTTCCTTGCGCAGCTGTGCTCGACGTTGTCACTGAA 

GCGGGAAGGGACTGGCTGCTATTGGGCGAAGTGCCGGGGCAGGATCTCCTGTCATCTCAC 

CTTGCTCCTGCCGAGAAAGTATCCATCATGGCTGATGO^TGCGGCGGCTGCATACGCTT 

GATCCGGCTACCTGCCCTVTTCGACCACCy^GCGAAACATCGCATCGAGCGAGCACGT^ 
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CGGATGGAAGCCGGTCTTGTCGATCAGGATGATCTGGACGAAbAGCATCAGGGGCTCGCG 
CCAGCCGAACTGTTCGCCAGGCTCAAG6CGCGCATGCCCGACGGCGAGGATCTCGTCGTG 
ACCCATGGCGATGCCTGCTTGCCGAATATCATGGT6GAAAATGGCCGCTTTTCTGGATTC 
ATCGACTGTGGCCGGCTGGGTGTGGCGGACCGCTATCAGGACATAGCGTTGGCTACCCGT 
GATATTGCTGAAGAGCTTGGCGGCGAATGGGCTGACCGCTTCCTCGTGCTTTACGGTATC 
GCCGCTCCCGATTCGCAGCGCATCGCCTTCTATCGCCTTCTTGACGAGTTCTTCTGAGCG 
GGACTCTGGGGTTCGAAATGACCGACCy^GCGACGCCCAACCTC 

AATAAAATATCTTTATTTTCATTACATCTGTGTGTTGGTTTTTTGTGTGAATCGATAGCG 
ATAAGGATCCGCGTATGGTGCACTCTCAGTACAATCTGCTCTGATGCCGCATAGTTAAGC 
CAGCCCCGACACCCGCCy^CACCCGCIXaCGCGCCCTGACGGGCTO^ 

TCCGCTTACAGAOU^GCTQTGACCGTCTCCGGGAGCTGCATGTGTCAGAGGTTTO 

TCa^TCACCGAAACGCGCGAGACGAAAGGGCCTCGTGATACGCCTATTTTTATAG^ 

GTCATGATAATAATGGTTTCTTAGACGTCAGGTGGCACTraTCGGGGAAATGTC^ 

ACCCCTATTTGTTTATTTTTCTAAATACATTCAAATATGTATCCGCTOlTGAGAa^^ 

CCCTGATAAATGCTTO^TAATATTGAAAAAGGAAGAGTATGAGTATTCAACA^ 

GTCGCCCTTATTCCCTTTTTTGCGGCATTTTGCCTTCCTGTTTTTG^ 

CT6GTGAAAGTAAAAGATGCTGAAGATCAGTTGGGTGCACGAGTGGGTTACATCGAACTG 

GATCTCAACAGCGGTAAGATCCTTGAGAGTTTTCGCCCCQAAG^ 

AGCAOTITTAAAGTTCTGCTATGTGGCGCGGTATTATC^ 

CAACTCGGTCGCCGCATACACTATTCTCAGAATGACTTGGTTGAGTACT^^ 

GAAAAGCATOTAOSGATGGCATGACAGTAAGAGAATTATGCAGTGCTQCCAT^ 

AGTQATAACACTGCGGCCAACTTACTTCTGACAACGATCGGAGGACCGAAGGAGCTAACC 

QCTTTTTTGCACAACyiTGGGGGATCATGTAACTCGCCTTGATCGTT^ 

AATGAAGCCATACa^CGACGAGCGTGAO^COlCGATGCCTGTAGCAAT^ 

TTGCGCAAACTATTAACTGGCGAACTACTTACTCTAiGC^ 

TGGATGQAGGCGGATAAAGTTGC7UX3ACCa^CrTCrGTO^ 

TTTATTGCTGATAAATCTGGAQCCQGTGAGCGTGGGTCTCGCGGTATCATTG(^^ 
GGGCCAGATGGTAAGCCCTCCCGTATCGTAGTTATCTACACQACGGGGAGTC^^ 
ATGGATGAACGAAATAGACAGATCGCTGAGATAGGTGCCTCACTGATTAAQCAT^ 
CTGTCAGACCAAGTTTACTCATATATACTTTAGATTGAT^ 

AAAAGGATCTAGGTG^U^TCCTTTTTGATAATCTCATGACC^^ ' 

TTTTCGTTCCACTGAGCOTCAGACCCCGTAGAAAAGATCAA^ 

TTTTTTCTGCGCGTAATCTGCTGCTTGCAAACAAAAAAACC^^ 

TGTTTGCCGGATCAAGAGCTACCAACTCTTTTTCCGAAGGTAACTGGCTTO^GCAGA^ 

CAGATACCAAATACTGTCCTTCTAGTGTAGCCGTAGTTAGGCCACaVCT^ 

GTAGCACCGCCTACATACCTCGCTCTGCTAATCCTGTTACOVGTGGCTGCTGCCAGT^ 

GATAAGTCQTGTCTTACCGGQTTGGACTCa^GACGATAGTTACCGGATAAGGCGCA^ 

TCGGGCTGAACGGGGGGTTCGTGCTLCACAGCCCAGCTTGGAGCGAAC^^ 

CT6AGATACCTACAGCGTGAGCTATGAGAAAGCGCCACGCTTCCCGAAGGGAGAAAG6CG 

GACAGGTATCCGGTAAGCGGCAGGGTCGGAACAGGAGAGCGOVCGAGGGAGCTTCCA^ 

GGAAACGCCTGGTATCTTTATAGTCCTGTCGGGTTTCGCCACCTCTGACTTGAGCGTCGA 

TTTTTGTQATGCTCGTCAGGGGGGCGGAGCCTATGGAAAAACGCCAGCAACGCGGCCT^ 

TTACGGTTCCTGGCCTTTTGCTGGCCTTITOCTCACATGGCTCG^ 
SEQ. ID. NO. 3 - Envelope Gene from HIV-1 MN (Genbank accession no. M17449)

ATGAGAGTGA AGGGGATCAG GAGGAATTAT CAGCACTGGT GGGGATGGGG CACGATGCTC 60
CTTGGGTTAT TAATGATCTG TAGTGCTACA GAAAAATTGT GGGTCACAGT CTATTATGGG 120
GTACCTGTGT GGAAAGAAGC AACCACCACT CTATTTTGTG CATCAGATGC TAAAGCATAT 180
GATACAGAGG TACATAATGT TTGGGCCACA CAAGCCTGTG TACCCACAGA CCCCAACCCA 240
CAAGAAGTAG AATTGGTAAA TGTGACAGAA AATTTTAACA TGTGGAAAAA TAACATGGTA 300
GAACAGATGC ATGAGGATAT AATCAGTTTA TGGGATCAAA GCCTAAAGCC ATGTGTAAAA 360
TTAACCCCAC TCTGTGTTAC TTTAAATTGC ACTGATTTGA GGAATACTAC TAATACCAAT 420
AATAGTACTG CTAATAACAA TAGTAATAGC GAGGGAACAA TAAAGGGAGG AGAAATGAAA 480
AACTGCTCTT TCAATATCAC CACAAGCATA AGAGATAAGA TGCAGAAAGA ATATGCACTT 540
CTTTATAAAC TTGATATAGT ATCAATAGAT AATGATAGTA CCAGCTATAG GTTGATAAGT 600
TGTAATACCT CAGTCATTAC ACAAGCTTGT CCAAAGATAT CCTTTGAGCC AATTCCCATA 660
CACTATTGTG CCCCGGCTGG TTTTGCGATT CTAAAATGTA ACGATAAAAA GTTCAGTGGA 720
AAAGGATCAT GTAAAAATGT CAGCACAGTA CAATGTACAC ATGGAATTAG GCCAGTAGTA 780
TCAACTCAAC TGCTGTTAAA TGGCAGTCTA GCAGAAGAAG AGGTAGTAAT TAGATCTGAG 840
AATTTCACTG ATAATGCTAA AACCATCATA GTACATCTGA ATGAATCTGT ACAAATTAAT 900
TGTACAAGAC CCAACTACAA TAAAAGAAAA AGGATACATA TAGGACCAGG GAGAGCATTT 960
TATACAACAA AAAATATAAT AGGAACTATA AGACAAGCAC ATTGTAACAT TAGTAGAGCA 1020
AAATGGAATG ACACTTTAAG ACAGATAGTT AGCAAATTAA AAGAACAATT TAAGAATAAA 1080
ACAATAGTCT TTAATCAATC CTCAGGAGGG GACCCAGAAA TTGTAATGCA CAGTTTTAAT 1140
TGTGGAGGGG AATTTTTCTA CTGTAATACA TCACCACTGT TTAATAGTAC TTGGAATGGT 1200
AATAATACTT GGAATAATAC TACAGGGTCA AATAACAATA TCACACTTCA ATGCAAAATA 1260
AAACAAATTA TAAACATGTG GCAGGAAGTA GGAAAAGCAA TGTATGCCCC TCCCATTGAA 1320
GGACAAATTA GATGTTCATC AAATATTACA GGGCTACTAT TAACAAGAGA TGGTGGTAAG 1380
GACACGGACA CGAACGACAC CGAGATCTTC AGACCTGGAG GAGGAGATAT GAGGGACAAT 1440
TGGAGAAGTG AATTATATAA ATATAAAGTA GTAACAATTG AACCATTAGG AGTAGCACCC 1500
ACCAAGGCAA AGAGAAGAGT GGTGCAGAGA GAAAAAAGAG CAGCGATAGG AGCTCTGTTC 1560
CTTGGGTTCT TAGGAGCAGC AGGAAGCACT ATGGGCGCAG CGTCAGTGAC GCTGACGGTA 1620
CAGGCCAGAC TATTATTGTC TGGTATAGTG CAACAGCAGA ACAATTTGCT GAGGGCCATT 1680
GAGGCGCAAC AGCATATGTT GCAACTCACA GTCTGGGGCA TCAAGCAGCT CCAGGCAAGA 1740
GTCCTGGCTG TGGAAAGATA CCTAAAGGAT CAACAGCTCC TGGGGTTTTG GGGTTGCTCT 1800
GGAAAACTCA TTTGCACCAC TACTGTGCCT TGGAATGCTA GTTGGAGTAA TAAATCTCTG 1860
GATGATATTT GGAATAACAT GACCTGGATG CAGTGGGAAA GAGAAATTGA CAATTACACA 1920
AGCTTAATAT ACTCATTACT AGAAAAATCG CAAACCCAAC AAGAAAAGAA TGAACAAGAA 1980
TTATTGGAAT TGGATAAATG GGCAAGTTTG TGGAATTGGT TTGACATAAC AAATTGGCTG 2040
TGGTATATAA AAATATTCAT AATGATAGTA GGAGGCTTGG TAGGTTTAAG AATAGTTTTT 2100






GCTGTACTTT CTATAGTGAA TAGAGTTAGG CAGGGATACT CACCATTGTC GTTGCAGACC 2160
CGCCCCCCAG TTCCGAGGGG ACCCGACAGG CCCGAAGGAA TCGAAGAAGA AGGTGGAGAG 2220
AGAGACAGAG ACACATCCGG TCGATTAGTG CATGGATTCT TAGCAATTAT CTGGGTCGAC 2280
CTGCGGAGCC TGTTCCTCTT CAGCTACCAC CACAGAGACT TACTCTTGAT TGCAGCGAGG 2340
ATTGTGGAAC TTCTGGGACG CAGGGGGTGG GAAGTCCTCA AATATTGGTG GAATCTCCTA 2400
CAGTATTGGA GTCAGGAACT AAAGAGTAGT GCTGTTAGCT TGCTTAATGC CACAGCTATA 2460
GCAGTAGCTG AGGGGACAGA TAGGGTTATA GAAGTACTGC AAAGAGCTGG TAGAGCTATT 2520
CTCCACATAC CTACAAGAAT AAGACAGGGC TTGGAAAGGG CTTTGCTATA A 2571
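
Because codon optimisation substitutes synonymous codons only, the wild type and codon optimised env sequences should translate to the same amino acid sequence. The Python sketch below illustrates such a check on the first 60 bases of each sequence as given in this listing. It assumes the standard genetic code; the function name `translate` is hypothetical, and the check covers only this short prefix, not the full 2571 b.p. sequences.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ('*' marks a stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a coding sequence in frame; whitespace is ignored and any
    trailing partial codon is dropped."""
    seq = "".join(dna.split()).upper()
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


# First 60 bases of the wild type env (HIV-1 MN) and of the codon optimised env.
wild_type = "ATGAGAGTGA AGGGGATCAG GAGGAATTAT CAGCACTGGT GGGGATGGGG CACGATGCTC"
optimised = "ATGAGGGTGA AGGGGATCCG CCGCAACTAC CAGCACTGGT GGGGCTGGGG CACGATGCTC"

same_protein = translate(wild_type) == translate(optimised)
print(translate(wild_type), same_protein)
```

The two nucleotide strings differ, yet both translate to the same twenty residues, which is the property expected of a codon optimised sequence.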



SEQ. ID. NO. 4 - SYNgp-160mn - codon optimised env sequence

ATGAGGGTGA AGGGGATCCG CCGCAACTAC CAGCACTGGT GGGGCTGGGG CACGATGCTC 60
CTGGGGCTGC TGATGATCTG CAGCGCCACC GAGAAGCTGT GGGTGACCGT GTACTACGGC 120
GTGCCCGTGT GGAAGGAGGC CACCACCACC CTGTTCTGCG CCAGCGACGC CAAGGCGTAC 180
GACACCGAGG TGCACAACGT GTGGGCCACC CAGGCGTGCG TGCCCACCGA CCCCAACCCC 240
CAGGAGGTGG AGCTCGTGAA CGTGACCGAG AACTTCAACA TGTGGAAGAA CAACATGGTG 300
GAGCAGATGC ATGAGGACAT CATCAGCCTG TGGGACCAGA GCCTGAAGCC CTGCGTGAAG 360
CTGACCCCCC TGTGCGTGAC CCTGAACTGC ACCGACCTGA GGAACACCAC CAACACCAAC 420
AACAGCACCG CCAACAACAA CAGCAACAGC GAGGGCACCA TCAAGGGCGG CGAGATGAAG 480
AACTGCAGCT TCAACATCAC CACCAGCATC CGCGACAAGA TGCAGAAGGA GTACGCCCTG 540
CTGTACAAGC TGGATATCGT GAGCATCGAC AACGACAGCA CCAGCTACCG CCTGATCTCC 600
TGCAACACCA GCGTGATCAC CCAGGCCTGC CCCAAGATCA GCTTCGAGCC GATCCCCATC 660
CACTACTGCG CCCCCGCCGG CTTCGCCATC CTGAAGTGCA ACGACAAGAA GTTCAGCGGC 720
AAGGGCAGCT GCAAGAACGT GAGCACCGTG CAGTGCACCC ACGGCATCCG GCCGGTGGTG 780
AGCACCCAGC TCCTGCTGAA CGGCAGCCTG GCCGAGGAGG AGGTGGTGAT CCGCAGCGAG 840
AACTTCACCG ACAACGCCAA GACCATCATC GTGCACCTGA ATGAGAGCGT GCAGATCAAC 900
TGCACGCGTC CCAACTACAA CAAGCGCAAG CGCATCCACA TCGGCCCCGG GCGCGCCTTC 960
TACACCACCA AGAACATCAT CGGCACCATC CGCCAGGCCC ACTGCAACAT CTCTAGAGCC 1020
AAGTGGAACG ACACCCTGCG CCAGATCGTG AGCAAGCTGA AGGAGCAGTT CAAGAACAAG 1080
ACCATCGTGT TCAACCAGAG CAGCGGCGGC GACCCCGAGA TCGTGATGCA CAGCTTCAAC 1140
TGCGGCGGCG AATTCTTCTA CTGCAACACC AGCCCCCTGT TCAACAGCAC CTGGAACGGC 1200
AACAACACCT GGAACAACAC CACCGGCAGC AACAACAATA TTACCCTCCA GTGCAAGATC 1260
AAGCAGATCA TCAACATGTG GCAGGAGGTG GGCAAGGCCA TGTACGCCCC CCCCATCGAG 1320
GGCCAGATCC GGTGCAGCAG CAACATCACC GGTCTGCTGC TGACCCGCGA CGGCGGCAAG 1380
GACACCGACA CCAACGACAC CGAAATCTTC CGCCCCGGCG GCGGCGACAT GCGCGACAAC 1440
TGGAGATCTG AGCTGTACAA GTACAAGGTG GTGACGATCG AGCCCCTGGG CGTGGCCCCC 1500
ACCAAGGCCA AGCGCCGCGT GGTGCAGCGC GAGAAGCGGG CCGCCATCGG CGCCCTGTTC 1560
CTGGGCTTCC TGGGGGCGGC GGGCAGCACC ATGGGGGCCG CCAGCGTGAC CCTGACCGTG 1620
CAGGCCCGCC TGCTCCTGAG CGGCATCGTG CAGCAGCAGA ACAACCTCCT CCGCGCCATC 1680
GAGGCCCAGC AGCATATGCT CCAGCTCACC GTGTGGGGCA TCAAGCAGCT CCAGGCCCGC 1740
GTGCTGGCCG TGGAGCGCTA CCTGAAGGAC CAGCAGCTCC TGGGCTTCTG GGGCTGCTCC 1800
GGCAAGCTGA TCTGCACCAC CACGGTACCC TGGAACGCCT CCTGGAGCAA CAAGAGCCTG 1860
GACGACATCT GGAACAACAT GACCTGGATG CAGTGGGAGC GCGAGATCGA TAACTACACC 1920
AGCCTGATCT ACAGCCTGCT GGAGAAGAGC CAGACCCAGC AGGAGAAGAA CGAGCAGGAG 1980
CTGCTGGAGC TGGACAAGTG GGCGAGCCTG TGGAACTGGT TCGACATCAC CAACTGGCTG 2040
TGGTACATCA AAATCTTCAT CATGATTGTG GGCGGCCTGG TGGGCCTCCG CATCGTGTTC 2100
GCCGTGCTGA GCATCGTGAA CCGCGTGCGC CAGGGCTACA GCCCCCTGAG CCTCCAGACC 2160
CGGCCCCCCG TGCCGCGCGG GCCCGACCGC CCCGAGGGCA TCGAGGAGGA GGGCGGCGAG 2220
CGCGACCGCG ACACCAGCGG CAGGCTCGTG CACGGCTTCC TGGCGATCAT CTGGGTCGAC 2280
CTCCGCAGCC TCTTCCTGTT CAGCTACCAC CACCGCGACC TGCTGCTGAT CGCCGCCCGC 2340
ATCGTGGAAC TCCTAGGCCG CCGCGGCTGG GAGGTGCTGA AGTACTGGTG GAACCTCCTG 2400
CAGTATTGGA GCCAGGAGCT GAAGTCCAGC GCCGTGAGCC TGCTGAACGC CACCGCCATC 2460
GCCGTGGCCG AGGGCACCGA CCGCGTGATC GAGGTGCTCC AGAGGGCCGG GAGGGCGATC 2520
CTGCACATCC CCACCCGCAT CCGCCAGGGG CTCGAGAGGG CGCTGCTGTA A 2571

SEQ ID No. 5 - pESYNGP

TCy^TATTGGCCATTAGCCATATTATTCATTGGTTATATAGCATAAATCT^TA 

TTGGCCATTGCATACGTTGTATCTATATCATAATATGTACTITTTATATTGGCTCATGTCC 

AATATGACCGCCATGTTGGOVTTGATTATTGACTAGTTATTAATAGTAATCAATO^ 

GTOITTAGTTCATAGCCCATATATGGAGTTCCGCGTTACATAACTTACGGTAAATGGCCC 

GCCTGGCTGACCGCCCAACGACCCCCGCCCATTGACGTCAATAATGACGTATGTTCCCAT

AGTAACGCCAATAGGGACTTTCCATTGACGTCAATGGGTGGAGTATTTACGGTAAACTGC 

CCACTTGGCAGTAO^TCAAGTGTATCATATGCCAAGTCCGCCCCCTATTGACGTCAATGA 

CGGTAAATGGCCCGCCTGGCATTATGCCCAGTACATGACCTTACGGGACTTT^^ 

GOVGTACATCTACGTATTAGTCATCGCTATTACCATGGTGATGCGGTT^ 

CAATGGGOTTGGATAGCGGITTGACTCACGGGGATTTCCT^GTCTCCACCCC^^ 

CTUVTGGGAGTTTGTTTTGGCACCAAAATOUICGGGACTTTCCAAAATGTCGTAACAA 

CGATCGCCCGCCCCGTTGACGCAAATGGGCGGTAGGCGTGTACGGTGGGAGGTCTATATA 

AGCAGAGOTCGTTTAGTGAACCGTCS^GATCACTAGAAGCTTTATTGCGGTAGT^ 

AGTTAAATTGCTAACGaWSTOUSTGCTTCrGACACAAC^^ 

QACTCTCTOAAGGTAGCCTTGOIGAAGTTGGTCGTGAGGCACTGGGCAGGTAAGT^^ 

GGTTACAAGACAGGTTTAAGGAGACCAATAGAAACTGGGCTTGTCGAGACAGAGAAGACT 

CTTGCGTTTCTGATAGGCACCTATTGGTCTTACTGACATCCACTTTGCCTTTCTCTCCAC 

AGGTGTCCACTCCCAGTTCAATTACAGCTCTTAAGGCTAaAGTACTTAATACGACTCACT. 

ATAGGCTAGAGAATTCGCCACCATGGGCGATCCCCTCACCTGGTCCAAAGCCCTGAAGAA 

ACTGGAAAAAGTCACCGTTCAGGGTAGCCAAAAGCTTACCACAGG(^TTGCAAC^^ 

ATTGTCCCTGGTGGATCTTTTCCACGACACTAATTTCGTTAAGGAGAAAGATTGGCAACT 

CAGAGACGTGATCCCCCTCTTGGAGGACGTGACCCAAACATTGTCTGGGCAGGAGCGCGA 

AGCTTTCGAGCGCACCTGGTGGGCCATCAGCGCAGTCAAAATGGGGCTGCAAAT^ 

CGTGGTTGACGGTAAAGCTAGCTTTOVACTGCTCCGCGCTAAGTACGAGAAGAAA^ 

CAACAAQAAACAATCCGAACCTAGCGAGGAGTACCCAATTAT^ 

TAGGAACTTCCGCCCACTGACTCCCAGGGGCTATACCACCTGGGTCAACACCATCCAGAC 

AAACGGACTTTTGAAOSAAGCCTCCCT^GAACCTQTTCGGCATCCTGTCTGTGGACTGaV^ 

CTCCGAAGAAATGAATGCTTTTCTCGACGTGGTGCCAGGACAGGCTGGACAGAAACAGAT 

CCTGCTCGATGCCATTGACAAGATCGCCGACGACTGGGATAATCGCCACCCCCTGCCAAA 

CGCCCCTCTGGTGGCTCCCCCACTIGQGQCCTATCCCTATGACCGCTAGGTTCAT^^ 

ACTGGGGGTGCCCCGCGAACGCCMATGGAGCCAGCATTTGACCAATTTAGGCAGACCTA 

CAGACAGTGGATCATCGAAGCCATGAGCGAGGGGATTAAAGTCATGATCGGAAAGCCCA^ 

GGCACAGAACATCAGGCAGGGGGCCAAGGAACCATACCCTGAGTTTGTCGACAGGCTTCT 

GTCCCAGATTAAATCCGAAGGCCACCCTCAGGAGATCTCCAAGTTCTTGACAGACACACT 

GACTATCCAAAATGCAAATGAAGAGTGCAGAAACGCCATGAGGCyiCCTCAaACCTGAAGA 

TACCCTGGAGGAGAAAATGTACGCATGTCGCGAOVTTGGCACTACCAAGC^^ 

GCTGCrCGCCAAGQCTCTGa^CCGQCCTOGCTGGTCCATTCAAAGGAGGAGC^^ 

GGGAGGTCCATTGAT^GCTGCACAAACATGTTATAATTGTGGGAAGCCAGGACATTTATC 

TAGTCAATGTAGAQCACCTAAAGTCTGTTTTAAATGTAAACAGCCTGGACATTTCTCAAA 

GCAATGCy^GAAGTGTTCCAAAAAACGGGAAGCAAGGGGCTCAAGGGAGGCCCCyVGAAAC^ 

AACTTTCCCGATAOVACAGAAGAGTCAGCAO^CAAATCTGTTGTAC^ 

GACTCAAAATCTGTACCOVGATCTGAGCGAAATAAAAAAGGAATACAATGT^ 

GGATCAAGTAGAGGATCTCAACCTGGACAGTTTGTGGGAGTAACATACAATCTCGAGAAG 

AGGCCCACTACCATCGTCCTGATCAATGACACCCCTCTTAATGTGCTGCTGGACACCGGA 

GCCGACACCAGCGTTCTCACTACTGCTCACTATAACAGACTGAAATACAGAGGAAGGAAA 

TACCAGGGCACAGGCATCATCGGCGTTGGAGGCAACGTCGAAACCTTTOCCACTCCTGTC 

ACCATCTVAAAAGAAGGGQAGACyiCATTAAAACCAGAATGCTGGTCGCCGACATCCCTC^ 

ACCATCCTTGGCAGAGACATTCTCCAGGACCTGGGCGCTAAACTCGTGCTGGCACAACTG 

TCTAAGQAAATCAAGTTCCGCAAGATCGAGCTGAAAGAGGGCACAATGGGTCCAAAAATC 

CCCCAGTGGCCCCTGACCAAAGAGAAGCTTGAGGGCGCTAAGGAAATCGTGCAGCGCCTG 

CTTTCTGAGGGCAAGATTAGCGAGGCO^GCGACAATAACCCTTACAACAGCCCCA^ 

GTGATTAAGAAAAGGAGCGGCAAATGGAGACTCCTGCAGGACCTGAGGGAACT 

ACCGTCCAGGTCGGAACTGAGATCTCTCGCGGACTGCCTCACCCCGGCGGCCTGATTAAA 

TGCAAGO^OITGACAGTCCTTGAGATTGGAGACGCTTATTTTACCATCCCCCTCGATC 

GAATTTCGCCCCTATACTGCTTTTACCATCCCCAGCATCAATCACCAGGAGCCCGATAAA 

CGCTATGTGTGGAAGTGCCTCCCCCAGGGATTTGTGCTTAGCCCCTACATTTACCAGA^ 
ACACTTCAAGAGATCCTCCAACCTTTCCGCGaWUUSATACCCy^GAGGTTCaA^^

TATATGGACGACCTGTTCATGGGGTCCAACGGGTCTAAGAAGCAGCACAAGGAACTCATC

ATCGAACTGAGGGCAATCCTCCTGGAGAAAGGCTTCGAGACACCCGACGACAAGCTGCAA 

GAAGTTCCTCCATATAGCTGGCTGGGCTACCAGCTTTGCCCTGAAAACTGGAAAGTCCAG 

AAGATCCAGTTGGATATGGTCAAGAACCO^CACTGAACGACGTCCAG^^ 

AATATTACCTGGATGAGCTCCGGAATCCCTGGGCTTACCGTTAAGCACATTGCCGC^ 

ACy^AAAGGATGCCTGGAGTTGAACCAGAAGGTCTlTTTGGACAGAG^^ 

CTGGAGGAGAATAATGAAAAGATTAAGAATGCTCAAGGGCTCCAATACTACAATCCCGAA 

GAAGAAATGTTGTGCGAGGTCGAAATCACTAAGAACTACGAAGCCACCTATGTCATCAAA 

CAGTCCCAAGGCATCTTGTGGGCCGGAAAGAAAATCATGAAGGCCAACAAAGGCTGGTCC 

ACCGTTAAAAATCTGATGCTCCTGCTCt^Ga^CGTCGCCACCGAGTCTATCACCCGCGT^ 

GGCAAGTGCCCCACCTTO^AAGTTCCCTTCACTAAGGAGCAGGTGATGTGGGA^^^ 

AAAGGCTGGTACTACTCTTGGCTTCCCGAGATCGTCTACACCCACCAAGTGGTGCACGAC

GACTGGAGAATGAAGCTTGTCGAGGAGCCCACTAGCGGAATTACAATCTATACCGACGGC 

GGAAAGCAAAACGGAGAGGGAATCGCTGCATACGTCACATCTAACGGCCGCACCAAGC^ 

AAGAGGCTCGGCCCTGTCACTCACCAGGTGGCTGAGAGGATGGCTATCCAGA^ 

GAGGAC:yVCTAGAGACy^G<aiGGTGAACa.TTGTGACTGACAGCTACTACTGCTGGAi^^ 

ATCa^CAGAGGGCCTTGGCCTGGAGGGACCCCAGTCTCCCTGGTGGCCTATCATCCAGAAT 

ATCCGCGAAAAGGAAATTGTCTATTTCGCCTGGGTGCCTGGACACAAAGGAATTTACGGC 

AACCAACTCGCCGATGAAGCCGCCAAAATTAAAGAGGAAATCATGCTTGCCTACCAGGGC 

ACACAGATTAAGGAGAAGAGAGACGAGGACGCTGGCTTTGACCTGTGTGTGCCATACGAC 

ATCATGATTCCCGTTAGCGACACAAAGATCATTCO^CaSATGTCAAGATCC^^ 

CCCAATTCATrrGGTTGGGTGACCGGAAAGTCCAGCATGGCTAAGCA 

AACGGGGGAATCATTGATGAAGGATACACCGGCGAAATCCAGGTGATCTGCACAAATATC 

GGOUVAAGCAATATTAAGCTTATCGAAGGGCAGAAGTTCGCTCAACTCATCATC 

CACCACAGCAATTCAAGACAACCTTGGQACGAAAACAAGATTAGCCaLGAG^ 

GGCTTCGGOiGCACAGGTGTGTTCTGGGTGGAGAACy^TCCAGGAAGCAaiGGACGAG 

GAGAATTGGCACACCTCCCCTAAGATTTTGGCCCQCAATTAC^ 

GCTAAGaiGATa^(:aVCa^GGAATGCCCCCACTGCACCAAA(^GGTTCT^ 

TGCGTGATGAGGTCCCCCAATCACTGGCAGGCAGATTGCACCCACCTCGACAACAAAATT 
ATCCTGACCTTCGTGGAGAGCAATTCCGGCTACATCCyiCGCTlAC^^ 

AATGCATTGTGCACCTCCCTCGCT^TTCTGGAATGGGCCAGGCTGTTCTCTCai^ 

CTGCACACCGACAACGGCAGCAACTTTGTGGCTOAACCTGTGGTC 

CTGAAAATCGCCaiCACCACTGGCSlTTCCCTATCACCCTGAAAGCC^^ 

AGGGCOU^CAGAACTCTGAAAGAAAAGATCCa^TCTCy^CAGAGACAATAC^ 

GAGGCCGCACTTCAGCTCGCCCTTATCACCTGOVACAAAGGAAGAGAAAG^^ 

CAGACCCCCTGGGAGGTCTTCATCACTAACCAGGCCCAGGTCATCCATGAAAAGCTGCTC 

TTGCAGCAGGCCCAGTCCTCCAAAAAGTTCTGCTTTTATAAGATCCCCGGTGAG 

TGGAAAGGTCCTACAAGAGTTTTGTGGAAAGGAGACGGCGCAGTTGTGGTGAACGATG^^ 

GGOUWSGGGATa^TCGCTGTGCCCCTGACACGCACCAAGCTTCTCATC^ 

ACCCGGGGCGGCCGCTTCCCTTTAGTGAGGGTTAATGCTTCGAGCAGACATGATAAGATA 

CATTGATGAGTTTGGACAAACCA(^CTAGAATGCAGTGAAAAAAATGCTTTAT^ 

AATTTGTGATGCTATTGCTTTATTTGTAACCATTATAAGCTGCAATAAACAAGTTAAC^ 

CAA(^TTGa^TT(^TTTTATGTTTCAGGTTCJU3GGGGAGATC^ 

CAAGTAAAACCTCTACAAATGTGGTAAAATCCGATAAGGATCGATCCGGGC^ 

AGCGAAGAGGCCCGCACCGATCGCCCTTCCCAACAGTTGCGCAGCCTGAATGGCGAATGG

ACGCGCCCTGTAGCGGCGCATTAAGCGCGGCGGGTGTGGTGGTTACGCGCAGCGTGACCG 

CTACACTTGCCAGCGCCCTAGCGCCCGCTCCTTTCGCTTTCTTCCCTTCCTTTCTCGCCA 

CGTTCGCCGGCTTTCCCCGTCAAGCTCTAAATCGGGGGCTCCCTTTAGGGTTCCGATTTA 

GAGCTTTACGGCACCTCGACCGCAAAAAACTTGATTTGGGTGATGGTTCACGTAGTGGGC 

CT^TCGCCCTGATAGACGGTTrTTCGCCCTTTGACGTTGGAGTCCACGTTCTTTAATAGTG 

GACTCTTGTTCCAAACTGGAACAACACTCAACCCTATCTCGGTCTATTCTTTTGATTTAT 

AAGGGATTTTGCCGATTTCGGCCTATTGGTTAAAAAATGAGCTGATTTAACAAATATTTA 

ACGCGAATTTTAACAAAATATTAACGTTTAGAATTTCGCCTGATGCGGTATTTTCTCCTT 

ACGCT^TCTGTGCGGTATTTCACACCGCATACGCGGATCTGCGCAGCACaiTGGCCTG^ 

TAACOTCTGAAA6AGGAACTTGGTTAGGTACCTTCTGAGGCGGAAAGAACCAGCTGTGGA 

ATGTGTGTCAGTTAGGGTGTGGAAAGTCCCCAGGCTCCCCAGCAGGCAGAAGTATGCAAA 

GCATGOVTCTCAATTAGTCyVGCAACCAGGTGTGGAAAGTCCCCAGGCTCCCCAGCAGG^ 

GAAGTATGCAAAGCATGCATCTCAATTAGTCAGCAACCATAGTCCCGCCCCTAACTCCGC 

CCATCCCGCCCCTAACTCCGCCCAGTTCCGCCCATTCTCC6CCCCATGGCTGACTAATO 

TTTTTATTTATGOVGAGGCCGAGGCCGCCTCGGCCTCTGAGCTATTCCAGAAGTAG^^ 
GAGGCTTTTTTGGAGGCCTAGGCTTTTGCAAAAAGCTTGATTCTTCTGACACAACAGTCT

CGAACTTAAGGCTAGAGCCACCATGATTGAACAAGATGGATTGCACGCS^GGTTCTCCG^ 

CGCTTGGGTGGAGAGGCTATTCGGCTATGACTGGGCACAACAGACAATCGGCTGCTCTGA 

TGCCGCCGTGTTCCGGCTGTCAGCGCAGGGGCGCCCGGTTCTTTTTGTCAAGACCGACCT 

GTCCGGTGCCCTGAATGAACTGCAGGACGAGGCAGCGCGGCTATCGTGGCTGGCCACGAC 

GGGCGTTCCTTGCGCAGCTGTGCTCGACGTTGTCACTGAAGCGGGAAGGGACTGGCTGCT 

ATTGGGCGAAGTGCCGGGGCAGGATCTCCTGTCATCTCACCTTGCTCCTGCCGAGAAAGT

ATCCATCATGGCTGATGCa^ATGCGGCGGCTGCTlTACGCTTGATCCGGCTACCTGCCCATT 

CGACCACCAAGCGAAACATCGCATCGAGCGAGCACGTACTCGGATGGAAGCCGGTCTTGT 

CGATCAGGATGATCTGGACQAA6AGCATCAGGGGCTCGCGCCAGCCGAACTGTTC6CCAG 

GCTCAAGGCGCGCATGCCCGACGGCGAGGATCTCGTCGTGACCCATGGCGATGCCTGCTT 

GCCGAATATCATGGTGGAAAATGGCCGCTTTTCTGQATTCATCGACTGTGGCCGGCTGGG 

TGTGGCQGACCGCTATCAGGACyiTAGCGTTGGCTACCCGTGATATTGCTGAAGAGCT^ 

CGGCGAATGGGCTGACCGCTTCCTCGTGCTTTACGGTATCGCCGCTCCCGATTCGCAGCG 

CATCGCCTTCTATCGCCTTCTTGACGAGTTCTTCTGAGCGGGACTCTGGGGTTCGAAATG 

ACCGACCAAGCGACGCCCAACCTGCCATCACGATGGCCGCAATAAAATATem 

ATTACATCOXSTGTGTTGGTTTTTTGTGTGAATCGATAGCGATAAGGATCCGCGTATGGTG 

CACTCTOlGTACS^TCTGCrCTGATGCCGCATAGTTAAGCCAGCC^ 

ACCCGCTGACGCGCCCIXSACGGGCTTGTCTGCTCCCGGCATCCGCTTA^ 

GACCGTCTCCGGGAGCTGCATGTGTCAGAGGTTTTCACCGTCATCACCGAAACGCGCOAG 

ACGAAAGGGCCTCGTGATACGCCTATTTTTATAGGTTAATGTCATGATAATAATGGTTTC 

TTAGACGTGAGGTGGCACTTTTCGGGGAAATGTGCGCGGAACCCCTATTTGTTTATTT^ 

CTAAATACyiTTO^TATGTATCCGCTCATGAGACAATAACCCTGATAAATGCTTC^ 

ATATTGAAAAAGGAAGAGTATGAGTATTCAACATTTCCGTGTCGCCCTTATTCCCTTTT^ 

TGCGGCa^TTTTGCCTTCCTGTTTTTGCTCa^CCCAGAAACGC^^ 

TGAAGATCAGTTGGGTGCACGAGTGGGTTACATCGAACTGGATCTCAACAGCGGTAAGAT 

CCTTGAGAGTTTTCGCCCCGAAGAACGTTTTCCa^TGATGAGCACTTTTAAAGTTCT^ 

ATGTGGCGCGGTATTATCCCGTATTGACGCOSGGCSAGAGCAACTCGGTCGCCGCATACA 

CTATTCTOIGAATGACTTGGTTGAGTACTCACCAGTCACAGAAAAGCATCTTACGGAT^ 

CATGAC^GTAAGAGAATTATGCAGTGCTGCCATAACCATGAGTGATAACACTGCGGC^ 

CTTACTTCTGACAACGATCGGAGGACCGAAGGAGCTAACCGCTin^TTGCACAACATGGG 

GGATCATGTAACTCGCCTTGATCGTTGGGAACCGGAGCTGAATGAAGCCATACCAAACGA 

CGAGCGTGACACCACGATGCCTGTAGCAATGGCAACAACGTTGCGC^^ 

CGAACTACTTACTCTAGCTTCCCGGCAACAATTAATAGACTOGATGGA^ 

TGOlGGACCACrraCTGCGCTCGGCCCTTCaSGCTGGCTGGTTTAOT^ 

AGCCGGTGAGCGTGGGTCTCGCGGTATCATTGCAGCACTGGGGCCAGATGGTAAGCCCTC

CCGTATCGTAGTTATCTACACGACGGGGAGTCAGGCAACTATGGATGAACGAAATAGACA 

GATCGCTGAGATAGGTGCCT<:yiCTGATTAAGCATTGGTAACTGTa^GACC?U^GTTTAC 

ATATATACTTTAGATTGATTTAAAACTTCATTTTTAATTTAAAAGGATCTAGGTGAAGAT 

CCTTTTTGATAATCTCATGACCAAAATCCCTTAACGTGAOTTTTCGTTCC^ 

AGACCCCGTAGAAAAGATGRAAGGATCTTCTTGAGATCCTTTTTTTCTGCGCGTAATOT^ 

CTGCTTGCaU^CaU^AAAAACCACCGCTACCAGCGGTC^ 

ACCAACTCTTTTTCCGAAGGTAACTGGCTTCAGCAGAGCGCAGATACCAAATACTGTCCT 
TCTAGTGTAGCCGTAGTTAGGCCACCACTTCAAGAACTCTGTAGCACCGCCTACATACCT 
CGCTCTGCTAATCCTGTTACCAGTGGCTGCTGCCAGTGGCGATAAGTCGTGTCTTACCGG
GTTGGACTCAAGACGATAGTTACCGGATAAGGCGCAGCGGTCGGGCTGAACGGGGGGTTC
GTGOlO^CAGCCCy^GCTTGGAGCGAACQACCTACACCGAACTGAGATACCTACAGCGTGA 
GCTATGAGAAAGCGCCACGCTTCCCGAAGGGAGAAAGGCGGACAGGTATCCGGTAAGCGG 
CAGGGTCGGAACAGGAGAGCGCACGAGGGAGCTTCCAGGGGGAAACGCCTGGTATCTTTA 
TAGTCCTGTCGGGTTTCGCCACCTCTGACTTGAGCGTCGATTTTTGTGATGCTCGTCAGG 
GGGGCGGAGCCTATGGAAAAACGCCAGCAACGCGGCCTTTTTACGGTTCCTGGCCTTTTG 
CTGGCCTTTTGCTCACATGGCTCGACAGATCT 

TCAATATTGGCCATTAGCCATATTATTCATTCMTTATATAGCATAAATCaU^TATTGGCTA
TTGGCCATTGCATACGTTGTATCTATATCATAATATGTACATTTATATTQGCTCATGTCC 
AATATGACCGCCATGrTGGCATTGATTATTCACTAGTTATTAATAGTA^^ 

GTOVTTAGTTCATAGCCCATATATGQAQTTCCGCGTTACATAACTTACGGTAAATGGCCC 
6CCTGGCTGACCGCCCAACQACCCCCGCCCATTGAC6TCAATAATGACGTATGTTCCCAT 
AGTAACGCOUVTAGGOACTTTCCATTGACGTCAATGGGTGGAGTATTTACGGTAAACTGC 
CO^CTTGGCAGTAOVTCAAGTGTATCSVTATGCa^GTCCGCCCCCTATTGACGTCAATGA 
CGGTAAATGGCCOKrCTGGCATTATGCCCAGTACATGACCTTACGGGACTTTCCTACTTG 
GCAGTACATCTACGTAT-EAGTCATCGCTATTACCATGGTGATGCGGTTTTGGCAGTACAC 
CAATGGGCGTQQATAGCGGTTTGACTCACGGGGATTTCCAAGTCTCCACCCCATTGACGT 
CAATGGGAGTTTGTTTTGGCACCAAAATCAACGGGACTTTCCSyUiATC^^ 

cgatcgcccgccccgttcacgcaaatgggcggtaggcgtqtacggtgggaggtcStaS 

AGCA6AGCTCGTTTAGTGAACCGTaM»Ta«:TAaAAGCTTTATTGCGGTAGTTTATCAC 
AGTTAAATTGCTAACQCAGTCAGTGCTTCTGACAauvCAGTCTCGAACTTAAGCTGCAGT 
GACTCTCTTAAGGTAGCCTTGCAGAAGTTGGTCGTGAGGCACTGGGCAGGTAAGTATCAA 
GGTTACAAGACAGGTTTAAGGAGACa\ATAGAAACTGGGCTTGTCGAGACASAGAAfiAOT 
CTTGCGTTTCTGATAGGCACCTATTGGTCTTACTGAOirCCaCTTTCC 

aggtgtccactccoigttcaatoacagctcttaaggctagagtacttaatacgaScSt 
ataggctagagaattcoagaqgggcgcagaccctacxjtgttgaacctggctgattoSS 

ATCCCCGGGA(3«KaiGAGGAGAACTTACAGAAGTCTTCTGGAGGTGTTCCTGGcSGS 
ACAG6AGGACAGGTAAGATGGGCQATCCCCTCACCTGGTCCAAAGCCCTGAAGAAACTGG 
AAAAAGTCU^CCGTTCAGGGTAGCCAAAAGCTTACCACAQGCaATI^ 
^n^^°f*^'''^^^^^^®^^CT^TTTCGTTAAGaAGAAAGATTQG^^ 

acgtgatccccctcttggaggacgtgacccaaacattctctgggoiggaqScgaS 

TCGAGCGOlCCTGGTGGGCCATCAGCGaVGTCAAAAa^^ 

ttgacggtaaagctagctttovactgctccgcgctaagtacgaga^^ 

AGAAACAATCCGAACCTAGOSAGGAGTACCCAATTATGATCGACGGCGCCGGCAAT^ 
ACTTCCGCCCACTGACTCCO^GGGGCTATACCyVCCTGGGTCAACACCaTCCaScS^ 
GACTTTa^CGAAGCCTCCCAGAACCTGTTCGGOVTCCTOTCnXST^ 

AAGAAATGAATGCTTTTCTaSACGTGGTGCCAQGACAGGC^ACAGAAACAGATCOT^ 
TCGATGCCArreaO^TCGCCQACGACTGGGATAATCGCCAcS^ 

CTCTGGTQGCTCCCCCACAGGGGCCTATCCCTATGACCGCTAGGTTCATTAGG^™ 

gggtgccccgcgaacgccagatggagccagcatttgaco^tSaSS^^JS? 
agtggatcatcgaagccatgagcgaggggattaaagxcatgaSg^^ 

AGATTAAATCCGAAGGCCACCCTaUSQAGATCTCaVAGTTCTTGACAGA^ 
TCCAAAATGO^TGAAGAGTGCAGAAACGCCATGA^^^^ 

tggaggagaaaatgtacgcatgtcgcgacattggcactaccaagcWSt^ 
?I™'^°^^''^^^^tgttataattgtgggaagccaggacatttSc^^ 

GCAGAAGTGTTCCAAAAAACGGGAAGCAAGGGGCTOAGGGAGGCCCa^GAAAC^^ 

tcctoatacaacagaaoaqtcagcaoacsvaatctgttgtacaagagS^ 

AJAATCTQTACCCAGATCTGAGCGAAATAAAAAAGGAATACAATGTcS^^ 

aagtagaggatctcaacctggacagtttgtgggagtaacatacaatctcSgSSS^ 
cactaccatcgtcctgatcaatgacacccctcttaatgtgctStS^ 

GGGCACAGGCaVTCyvTaSGCMTTGGAGGawvCGTCGAAACCTTTTCCACTCCTGT^ 

caaaaagaaggggagacacattaaaaccagaato^^ 

ccttggcagagacattctccaggacctgggcgctaaactcgtgctggcaS^ctSotS 
ggaaatcaagttccgcaagatcgagctgaaagagggcacaatScSS^tcJS?^ 
gtggcccctgacc^agaagcitoagggcgctaaggaaatcgtgS^ 

TGAGGGCAAGATTAGaSAGQCCAGCGACAATAACCCTTACAAa^GCCcSSST^^ 

taagaaaaggagcggcaaatggagactcctgcaggacctgSaS^ 

CCAGGTCGGAACTGAGATCTCTCGCGGACTGCCTCACCCCGGCGGCciS^ 

gcacatgacagtccttgacattggagacgcttattttaccatccccctcStcc^^ 

TCGCCCCTATACTGCTTTTACCATCCCCAGCaTOVATCACCAGSGScSJS^ 
TGTCTGGAAGTGCCTCCCCCS^GGGATTTGTGCTTAGCrcCTASTSS 

tcaagagatcctccaacctttccgcgaaagatacccagaggttSS^ 
GGACGACCTGTTCATGGGGTCCAACGGGTCTAAGAAGCAGCACAAGGAACTCATCATCGA

ACTGJM3GGC3VATCCTCCTGGAGAAAGGCTTCGAGACACCCGACGACAAGCTGCAAGAAGT 

TCCTCCATATA6CTGGCTGGGCTACCAGCTTTGCCCTGAAAACTGGAAA6TCCAGAAGAT 

GCAGTTGGATATGGTCAAGAACCCAACACTGAACGACGTCCAGAAGCTCATGGGCAATAT 

TACCTGGATQAGCTCCGGAATCCCTGGGCTTACCGTTAAGCACATTGCCGCaACTACAAA 

AGGATGCCTQaAQTrrGAACaWlAAQQTCATOTGGACAGAfiGAAGCTCAGAAGG^^ 

GGAaAATAATGAAAAGATTAAGAATGCTCAAGGGCTCCAATACTACAATCCCGAAGAAGA" 

AATQTTGTGCOAGGTCGAAATCACTAAGAACTACGAAGCCACCTATGTCATCAAACAGTC 

CCAAGGCATCTTGTGGGCCGGAAAGAAAATO^TGAAGGCCAACaVAaaGCTGQTCCACCGT 

TAAAAATCTGATGCTCCTGCTCCAGCACGTCGCCACCGAaTCTATCACCCGCGTCGGCRA 

STSCCCCACCTTCaAAGTTCCCTTCACTAAQaAQCAGGTGATGTGGGASATGCAAAAA^ 

CTGGTACTACTCTTGQCTTCCCGAGATCGTCTACACCavCCAAGTGGTGCACGACGACTG 

GAGAATQAftfiCTTGTCGAGGAGCCCACTAGCGGAATTACAATCTATACCGACGGCGGAAA 

GCAAAACGGAGAGGGAATCGCTGCATACGTCACATCTAACGQCCGCACCAAGCAAAA6AG 

GCTCGGCCCTGTCACTOlCCAGGTGGCrGAGAGGATGGCTATCCAGATGGCCCTTGAGGA 

(aCTAGAGACaAfiCafiaTGAACATTGTGACTQACAQCTACTACTGCTGG^ 

AGAGGGCCTTGQCCTGGAGGGACCCCAGTCTCCCTGGTGGCCTATCATCCAGAATATCCG 

CGAAAAGOaAATTGTCTATTTCGCCTGGGTGCCTGGACACAAAGGAATTTACGGCAACCA 

ACTCGCCGAT6AAGCCGC(aAAATTAAAGAGGAAATCATGCTT6CCTACCaGGOC3^CA 

GATTAAGGAGAAGAGAGAaSAGQACGCTGGCTTTGACCTGTGTGTGCCATACGACaT^^ 

GATTCCCGTTAGCGAOiaUUUSATCATTCCS^CaSATGTaU^TCCaGGTGC^ 

TTCATTTGGTTGGGTGACCGGAAAGTCCAGCATGGCTAAGCAGGGTCTTCTGATTAACGG 

GGGAATO^TTGATOAAGGATAaiCaMCGAAATCCAGGTGATCTGCACAAATATCGGCAA 

AAGCAATATTAAGCTTATCGAAGGGCAGAAGTTCGCTCAACTCATCATCCTCCAGCACCA 

OMSauVTTCAAGACAACCTTGGGACGAAAACAAGATTASCCAGAGAGGTGACAAGGGCTT 

CGGCaGCACAGGTGTGTTCTGGGTGQAQAACATCaVGGAAGauaMSGACGASCa 

TT6GCACACCTCCCCTAAGATTTTGQCCCQCAATTACAAGATCCCACTGACTGTGGCTAA 

QATaAGOT(XCC(»ATCACTGGaVGGCAGATTGCACCCACCTCGA(:a^ai^ 

QACCTTC6TGGAGAGCaATTCCGGCTAaVTCCACGCAACACTCCTCTaaU«3Gafl^ 

AlTGTGCACCTCCCTCGa^TTCTGGAATGGGCCSVGGCTGTTCTCrCCAAAATCCCTGaV 

CACCGACAACGGCACCAACTTTGTGGCTQAACCTGTGGTQAATCTaCTaaAGTTCCTGAA 

AATCGCCCaCSia^CTGQCATTCCCTATCACCCTGAAAGCCaGGGCATTX^^ 

CAAOWSAACTCTQAAAQAAAAGATCCAATCTCAaiGAGACAATACACAGACATTGGAGGC 

CGCACTTCaGCTCGCCCTTATCACCTGCAACAAAGGAAGAGAAAGCATGGGCGGCCAGAC 

CCCCTGGGAGGTCTTCATCACTAACCAGGCCCAGGTCATCCATGAAAAGCT6CTCTTGCA 

GCAGGCCCAGTCCTCCAAAAAGTTCTGCTTTTATAAGATCCCCG6TGAGCACGACTGGAA 

AG6TCCTACAAGAGTTTTGTGGAAAGGAGACGGCGCAGTTGTGGTGAACQATGAGGGCAA 

GGGGATO^TCGCTGTGCCCCTGAOVCGCACCaAGCTTCTCATCAAGCCAAACTGAACCCG 

QGGaSGCCGCTTCCCTTTAGTGAGGGTTAATGCTTCGAGCAGACATGATAAGATACATTG 

ATQAGTTTGGACyUVACCACaACTAGAATGCAGTGaAAAAAATGCTTTATTTGTGAAAm 

GTGATGCTATTGCTTTATTTGTAACCATTATAAGCTGCAATAAACAAGTTAACAACAACA 

ATTGCATTCATTTTATGTTTCAGGTTCaGGGGGAGATGTGGGAGGTTTTTTAAAG 

AAAACCTCTACAAATGTGGTAAAATCCQATAAGGATCGATCCGGGCTGGCGTAATAGCGA 

AGAGGCCCGCS^CCGATCGCCCTTCCO^aWSTTGCGCAGCCTGaATGGCGAATGGACGCG 

CCCTGTAGCGGCGCATTAAGCGCGGCGGGTGTGGTGGTTACGCGCAGCGTGACCGCTACA 

CTTGCCA6CQCCCTAQCGCCCGCTCCTTTCGCTTTCTTCCCTTCCTTTCTCGCCACGTTC 

GCCGGCTTTCCCCGTCAAGCTCTAAATCGGGGGCTCCCTTTAGGGTTCCaATTTAGAQCT 

TTACGGCACCTCGACCGCAAAAAACTTQATTTGGGTGATGGTTCACGTAQTCGGCCATCG 

CCCTGATAGACGGTTTOTCQCCCTTTOACQTTQGAQTCCACGTTCTTTAATAGTGGACTC 

rTGTTCCAAACTGQAACaACaCTCAACCCTATCTCGGTCTATrCTTTTGATTTATAAGGG 

ATTTTQCCGATTTCGGCCTATTGGTTAAAAAATGAGCTGATTTAACAAATATTTAACGCG 

AATTTTAACAAAATATTAACGTTTACAATTTCGCCTGATGCGGTATTTTCTCCTTACGCA 

TCTGTGCGGTATTTCACACCGCATACGCGGATCTGCGCAGCACCATGGCCTQAAATAACC 

TCTGAAAGAGGAACTTGGTTAQGTACCTTCTOAGQCGGAAAfiAACCAGCTGTGGAATGTG 

TGTCAGTTAGGGTGTGGAAAGTCCGCAGGCTCCCCAGCAGGCAGAAGTATGCAAAGCATG 

CATCTCAATTAGTCAGCAACCAGGTGTGGAaAGTCCCCAGGCTCCCCAGCAGGCAGAAGT 

ATGCAAAGCATGCATCTCAATTAGTCAGCAACCATAGTCCCGCCCCTAACTCCGCCCATC 

CCGCCCCTAACTCCGCCCAGTTCCGCCCATTCTCCGCCCCAT6GCTGACTAATTTTTTTT 

ATTTATGCAGAGGCCGAGGCaSCCTCGGCCTCTGAGCTATTCCaGAAGT^^ 

TTTTTTGGAGGCCTAGGCTTTTGCAAAAAGCTTGATTCTTCTGACACAACaGTC^ 
TTAAGGCTAGAGCCACCATGATTGAACAAGATGGATTGCACGCAGGTTCTCCGGCCGCTT

GGGTGGAGAGGCTATTCGGCTATGACTGGGCACAACAGACAATCGGCTGCTCTGATGCCG 

CCGTGTTCCGGCTGTCAGCGCAGGGGCGCCCGGTTCTTTTTGTC^GACCGACCTGTCCG 

GTGCCCTGAATGAACTGCSVGGACGAGGCAGCGCGGCTATCGTGGCTGGCC^^ 

TTCCTTGCGCAGCTGTGCTCGACGTTGTCACTGAAGCGGGAAGGGACTGGCTGCTATTGG 

GCGAAGTGCCGGGGCAGGATCTCCTGTCATCTCACCTTGCTCCTGCCGAGAAAGTATCCA 

TCATGGCTGATGCAATGCGGC6GCTGCATACGCTTGATCCGGCTACCTGCCCATTCGACC 

ACaU\GCGAAACATCGCATCGAGCGAGCACGTACTCGGATGGAA6CCGGTCTTC^ 

AGGATGATCTGGACGAAGAGCATCAQGGGCTCGCGCCAGCCGAACTGTTCGCCAGGCTCA 

AGGCGCGCATGCCCGACGGCGAGGATCTCGTCGTGACCCATGGCGATGCCTGCTTGCCGA 

ATATCATGGTGGAAAATGGCCGCTTTTCTGGATTCATCGACTGTGGCCGGCTGGGTGTGG 

CGGACCGCTATCAGGACATAGCGTTGGCTACCCGTGATATTGCTGAAGAGCTTGGCGGCG 

AATGGGCTGACCGCTTCCTCGTGCTTTACGGTATCGCCGCTCCCGATTCGCAGCGCATCG 

CCTTCTATCGCCTTCTTQACGAGTTCTTCTGAGCGGGACTCTGGGGTTCGAAATGACCGA 

CCAAGCGACGCCCAACCTGCCATCACGATGGCCGCAATAAAATATCTTTATTTTCyvI^ 

ATCTGTGTGTTGGTTTTTTGTGTGAATCGATAGCGATAAGGATCCGCGTATGGTGCACTC 

TCAGTACAATCTGCTCTGATGCCGCATAGrrTU^GCCAGCCCCGACACCCGCC^ 

CTGACGCGCCCTGACGGGCOTI'GTCTGCrCCCGGCATCCGCTTACAGACAAGCTGTGA^ 

TCTcdsGGAGCTGCATOTGTC^WSAQQTTTTC^^ 

AGGGCCTCGTGATACGCCTATTTTTATAGGTTAATGTCATGATAATAATGGTTTCTTAGA 
CGTCAGGTGGCACTTTTCGGGGAAATGTGCGCGGAACCCCTATTTGTTTATTTTTCTAAA 
TACATTCAAATATGTATCCGCTCATGAGACAATAACCCTGATAAATGCTTCAATAATAT^ 
GAAAAAGGAAGAGTATGAGTATTCAAGATTTCCGTGTCGCCCTTATTCCCTTTTTTGCGG 
CATTTTGCCTTCCTGTTTTTGCTCACCCa^^ 

ATCAGTTGGGTGCACGAGTGGGTTACATCGAACTGGATCTCAACAGCGGTAAGATCCTTG

AGAGTTTTCGCCCCGAAGAACGTTTTCCAATGATGAGCyiCTTTTAAAGTTCTGCTATGTG 

GCGCGGTATTATCCCGTATTGACGCCGGGCAAGAGCAACTCGGTCGCCGCATACACTATT 

CTCAGAATGACTTGGTTGAGTACTCACCAGTCACAGAAAAGCyiTCT^ 

CAGTAAGAGAATTATGCAGTGCrrGCCATAACCATO 

TTCI^CAAGGATCGGAGGACCGAAGGAGCTAACCGCTTTTTTGCA 

ATGTAACTCGCCnTGATCGTTGGQAACCGGAGCTGAATGAAGCCATACa^ 

GTGACACCACGATGCCTGTAGCAATGGCy^CAACGTTGCGCAAACTATTAACTC^ 

TACTTACTCTAGCTTCCCGGO^ACAATTAATAGACTGGATGGAGGCGGATAAAGT^ 

GACCACTTCTGCGCTCGGCCCTTCCGGCTGGCTGGTTTATTGCTGATAAATCTGGAGCCG 

GTGAGCGTGGGTCTCGCGGTATCATTGCAGCACTGGGGCCAGATGGTAAGCCCTCCCGTA

TCQTAGTTATCTACACGACOGGGAGTCAGGCAACTATGGATGAACGAAATAGACAGATCG 

CTGAGATAGGTGCCTCACTGATTAAGCATTGGTAACTGTCAGACCAAGTTTACTCATATA 

TACTTTAGATTGATTTAAAACTTCATTTTTAATTrAAAAGGATCTAGGTGAAGATCCTTT 

TTGATAATCTCATGACCy^AAATCCCTTAACGTGAGTTTTCGTTCCACTGAGCGT(^ 

CCGTAGAAAAGATCAAAGGATCTTCTTGAGATCCTTTTTTTCTGCGCGTAATCTGCTGCT 

TGCTUU^CyVAAAAAACCACCGCTACCAGCGGTGGTTTGTTTGCCGGATC^^ 

CTCTOmPCCGAAGGTAACTGGCTTCAGCAGAGCGCAGAT^ 

TGTAGCCGTAGTTAGGCCACCACTTCAAGAACTCTGTAGCa^CCGCCTACATACCTCGCTC 
TGCTAATCCTGTTACCAGTGGCTGCTGCCAGTGGCGATAAGTCGTGTCTTACCGGGTTGG 
ACTCAAGACGATAGTTACCGGATAAGGCGCAGCGGTCGGGCTGAACGGGGGGTTCGTGCA
CACAGCCCAGCTTGGAGCGAACGACCTACACCGAACTGAGATACCTACAGCGTGAGCTAT
GAGAAAGCGCCACGCTTCCCGAAGGGAGAAAGGCGGACAGGTATCCGGTAAGCGGCAGGG
TCGGAACAGGAGAGCGCACGAGGGAGCTTCCAGGGGGAAACGCCTGGTATCTTTATAGTC
CTGTCGGGTTTCGCCACCTCTGACTTGAGCGTCGATTTTTGTGATGCTCGTCAGGGGGGC
GGAGCCTATGGAAAAACGCCAGCAACGCGGCCTTTTTACGGTTCCTGGCCTTTTGCTGGC
CTTTTGCTCACATGGCTCGACAGATCT 



SEQ ID No. 7 - pESYNGPRRE

TOUlTATTGGCCATTAGCCyVTATTATTaiTTGGTTATATAGCarAAATaUlTATTGaCTA' 

TTGGCO^TTGCATACGTTGTATCTATATCATAATATGTAOlTTTATATTGGCrCATGTCC 

AATATGACCGCOVTGTTGGOVTTQATTATTCACTAGTTATTAATAGTAATCAATTACGGG 

GTCATTAGTTCATAGCCCATATATGGAGTTCCGC6TTACATAACTTACGGTAAATGGCCC 

GCCTGGCTGACCGCCCAACGACCCCCGCCCATTGACGTCAATAATGACGTATGTTCCC31T 

A6TAACGCCAATAGGGACTTTCCATTGACGTCAATGGGTGGAGTATTTACGGTAAACTGC 

CCACTTGGC3«3TAa^Ta«GTGTATCATATGCCAAGTCCGCCCCCTATTGACGTCaATGA 

CGGTAAATGGCOTGCCTGGO^OTATGCCCyVGTACaTQACCTTACGGGACTTTCCTACTTG 

GCAGTACATCTACGTATTAGTCATCGCTATTACCATGGTGATGCGGTTTTGGCAGTACAC 

OATGGGCGTGGATAGCGGTTTGACTCACGGGGATTTCCAAGTCTCCa^CCCCATTGACGT 

a^TGGGAGTTTGTTTTGGCACCAAAATaUi.CGGGACTTTCaU\AAT6TCGTAAC^ 

CGATCGCCaSCCCaSTTGACQOWU^TGGGCGGTAGGCraTGTACGGTGGGAGGTCTATATA 

AGOlGAGCrCGTTTAGTGAACCraTCAGATaiCTAGAAGCTTTATTGCGGTAGTTTATCAC 

AGTTAAATTGCTAACGCaGTCAGTGCTTCTGACACAACAGTCTCGAACTTAAGCTGCAGT 

GACTCTCTTAAGGTAGCCTTGCAGAAGTTGGTCGTGAGGCACTGGGCAGGTAAGTATCAA 

GGTTACAAGACaWSGTTTAAGGAGACCAATAGAAACTGGGCTTGTCQAGACAGAfiAAGACT 

CTTGCGTTTCTQATAGGCaCCTATTGGTCTTACTGACATCCACmTO 

AGGTGTCCACTCCO^TTCa^TTACAfiCTCOTAAGGCTAGAGTACTTAATACGACTCS^CT 

ATAQQCTAQAGAATTCGCCACCATGQGCGATCCCCTCACCTGGTCCAAAGCCCTGAAGAA 

ACTGGAAAAAGTCACCGTTCAGGGTAGCCAAAAGCTTACCACAGGCAATTGCAACTGGGC 

ATTGTCCCTGGTGGATCTTTTCCACGACACTAATTTCGTTAAGQAGAAAGATTGGCAACT 

CAGAGACGTGATCCCCCTCTTGGAGGACGTQACCCAAACATTGTCTGGGCAGaAGCGCQA 

AGCTTTCGAGCGCACCTGGTGGGCCATCAGa3CafiTCaAAATGGGGCTX3C^ 

CQTGGTTGACX3GTAAAGCTAQCTTTCAACTGCTCCGCGCTAAGTACGAGAAGAAAACCGC 

CAACAAGAAACAATCCGAACCTAGCGAGGAGTACCCAATTATGATCQACGGCGCCGGCAA 

TAGQAACTTCCGCCCACTGACTCCCAGGGGCTATACCACCTGGQTCAACACCATCCAGAC 

AAACGGACTTTTGAACGAAGCCax:CCS^GAACCTGTTa3GCATCCTGTCTGTGQACT6CSlC 

CTCCGAAGAAATGAATGCTTTTercaACGTGQTGCCAGGACaGaCTQaACAQAAAC^ 

CCTQCTCOATCCCATTGUiaUGATCGCCSACGACTGGGATAATarca^^ 

CGCCCCTCTGGTGQCTCCCCCACAGGGGCCTATCCCTATGACCGCTAGGTTCATTAGGGG 

A(SX3GGGQTQCCCCQCaAAaK:CAGATGGAGCCAGCATTTGACCAATTTAGGaiG^^ 

CAGACAGTGGATCATCGAAGCCATGAGCGAGGGGATTAAAGTCATGATCGGAAAGCCCAA 

GGCACAGAACyVTCAGGCAGGGGGCCAAGGAACCATACCCTGAGTTTGTCmCAGGCTTCT 

GTCCavGATTAAATCCGAAGGC(M.CCCTaiGQAGATCTC(^GTTCTTGA(yu^ 

GACTATCCAAAATGCAAATGAAGA6TGCAGAAACGCCATGAGGCACCTCAGACCTGAAGA 

TACCCTGGAGGAGAftAATGTACGCaVTGTCGCGACATTGGCACTACCAAGCAAAAGATGAT 

GCTGCTCGCCAAGGCTCTGCAAACCGGCCTGGCTGGTCCATTCAAAGGAG6AGCACTGAA 

GQGAGGTCaVTTGAAAGCTGCa^CAAACATGTTATAATTGTGGGAAGCCAGGACATTTArC 

TAGTCAATGTAGAGCACCTAAAGTCTGTTTTAAATGTAAAC3U3CCTGGACATTTCTCAAA 

GOATGCAGAAGTGTTCaW^AAAACGGGAAGCSUWSGGGCTCAAGGGAGGCCCCAGAAA^ 

AACTTTCCCGATAOACAGAAGAGTCAGCAaU^aU^TCTGTTGTAOU 

GACTCS^AAATCTGTACCCAGATCTGAGCGAAATAAAAAAGGAATACAATGTCAAGGAGAA 
GGATCAAGTAGAGGATCTCAACCTGGACAGTTTGTGGGAGTAACATACAATCTCGAGAAG 
AGGCCCACTACCATCGTCCTGATCAATGACACCCCTCTTAATGTGCTGCTGGACACCGGA 
GCCGACACCAGCGTTCTCACTACTGCTCACTATAACAGACTGAAATACaGAGGAAGGAAA 
TACOWSGGOiCafiGa^TCATCGGaSTTGGAGGa^TOTCGAAACCOTT^^ 

ACCATCAAAAAGAAGGGGAGACACATTAAAACCAGAATGCTGGTCGCCGACATCCCCGTC
ACCATCCTTGGCAGAGACATTCTCCAGGACCTGGGCGCTAAACTCGTGCTGGCACAACTG
TCTAAGGAAATCAAGTTCCGCAAGATCGAGCTGAAAGAGGGCACAATGGGTCCAAAAATC
CCCCAGTGGCCCCTGACCAAAGAGAAGCTTGAGGGCGCTAAGGAAATCGTGCAGCGCCTG
CTTTCTGAGGGCAAGArrAGCGAGGCaWKJQACAATAACCCTTACAACAGCCCCATCTTT 
GTGATTAAGAAAAGGAGCGGCAAATGGAGACTCCTGCAGGACCTGAGGGAACTCAACAAG 
ACCGTCCAGGTCGGAACTGAGATCTCTCGCGGACTGCCTCACGCCGGCGGCCTGATTAAA 
TGCflAGCACATGACAGTCCTTGACATTGGAGACGCTTATTTTACCATCCCCCTCGATCCT 
GAATTTCGCCCCTATACTGCTTTTACCATCCCCASCATCAATCACCAQGAGCCCGATAAA 
CGCTATGTGTGaAAGTGCCTCCCCCAGGGATTTGTGCTTAGCCCCTACATTTACCAGAAG 
ACACTTCAAQAOATCCTCCAACCTrPCCGCGAAAGATACCCAGAGGTTCAACTCTACCAA 
TATATGQACQACCTGTTCATGGGGTCCAACGGGTCTAAGAAGCAGCACAAGGAACTCATC 
ATCXSAACTGAGGQCAATCCTCCrGGAOAAAGGCTTCGAGACACCCQACGACAAGCTGCAA 
GAAGTTCCTCCATATAGCTGGCTGGGCTACCAGCTTTGCCCTGAAAACTGGAAAGTCCAG

AAGATGCAGTTGGATATGGTCAAGAACCCAACACTGAACGACGTCCAGAAGCTCATGGGC 

AATATTACCTGGATGAGCTCCGGAATCCCTGGGCTTACCGTTAAGCACATTGCCGCAACT 

ACAAAAGGATGCCTGGAGTTGAACCAGAAGGTa^TTTGGACAGAGGAAGCTCAGA^ 

CTGGAGGAGAATAATGAAAAGATTAAGAATGCTCAAGGGCTCCAATACTACAATCCCG 

GAAGAAATGTTGTGCGAGGTCGAAATCACTAAGAACTACGAAGCCACCTATGTCATCAAA 

CAGTCCCAAGGCATCTTGTGGGCCGGAAAGAAAATCATGAAGGCCAACAAAGGCTGGTCC 

ACCGTTAAAAATCTGATGCTCCTGCTCCAGCACGTCGCCACCGAGTCTATCACCCGCGTC 

GGOUUSTGCCCCACCTTCAAAGTTCCCTTCACTAAGGAGCAGGTGATCTGGGAGATGC^ 

AAAGGCTGGTACTACTCTTGGCTTCCCGAGATCGTCTACACCCACCAAG^^ 

GACTGQAGAATGAAGCTTGTCGAGGAGCCCACTAGCGGAATTACAATCTATACCGACGGC 

GGAAA6CAAAACGGAGAGGGAATCGCTGCATACGTCACATCTAACGGCCGCACCAAGCAA 

AAGAGGCTCGGCCCTGTCACTCACCAGGTGGCTGAGAGGATGGCTATCCAGATGGCCCTT 

GAGGACACTAGAOAaUUSCAGGTGAACATTGTGACTGACAGC^ 

ATOIO^AGGGCCTTGGCCTGGAGGGACCCCAGTCTCCCTGGTGGCCTATCATCC^ 

ATCCGCGAAAAGGAAATTGTCTATTTCGCCTGGGTGCCTGGACAC^^ 

AACCAACTCGCCGATGAAGCCGCCTUkAATTAAAGAGGAAATGATGCTTGCCTACCA 

ACACAGATTAAGGAGAAGAGAGACGAGGACGCTGGCTTTGACCTGTGTGTGCCATACGA^ 

ATCATGATTCCCGTTAGCGACACAAAGATCTlTTCCaUVCCGATGTCTVAGATC 

CCCAATTCaiTTTGGTTGGGTGACCGGAAAGTCaiGCATGGCTAAGCAGGGT^ 

AACGGGGGAATCATTGATGAAGQATAa^CCGGCGAAATCCAGGTGATCTGCAC^^ 

GGCAAAAGCW^TATTAAGCTTATCGAAGGGC^GAAGrrCGCTCAACTC^ 

CACCACAGCAATTCAAGACAACCTTGGGACGAAAACAAGATTAGCCA 

GGCTTCGGCAGCACAGGTGTGTTCTGGGTGGAGAACATCCAGGAAGCACAGGACGAGC^^ 

GAGAATTGGCACa.CCTCCCCTAAGATTTTGGCCCGaUlTTAa\AGATCCCACTGACTGTC 

GCTAAGCAGATCACaVCAGGAATGCCCCCACTGCACa^CAAGGTTCTGQCCCCGC 

TGCXSTGATGAGGTCCCCCAATCACTGGCAGGCAGATTGCACCCACCTCGAC^ 

ATCCTGACCTTCGTGGAGAGCy^TTCCGGCTACATCCACGCAACACTCCTCTCC^ 

AATGCATTGTGCACCTCCCTCGCAATTCTGGAATGGGCt^GGCTGTTCTC^ 

CTGCaiCACCGACAACGGCACaUVCTTTGTGGCTGAACCTGTGQTGAATCTGCTGAAGTTC 

CTOAAAATCGCCCACACCACTGGCATTCCCTATCaiCCCTCAAAG^^ 

AGGGCCAACSWSAACTCTGAAAGAAAAGATCCaATCTCACA^^^ 

GAGGCCGCACOTCTVGCTCGCCCTTATCACCTGaUVCAAAGG^ 

CAGACCCCCTGGGAGGTCTTCATCACTAACCAGGCCCAGGTCATCCATGAAAAGCTGCTC 

TTGCAGCAGGCCCAGTCCTCCAAAAAGTTCTGCTTTTATAAGATCCCCGGTGAGCACGAC 

TGGAAAGGTCCTACAAGAGTTTTGTGGAAAGGAGACGGOSCAGTTGTGGTGAACGATQAG 

GGaulGGGGATCATCGCTGTGCCCCTGACACGCaV.CCAAGCTTCT 

ACCCGACGAATCCCAGGGGGAATCTCTACCCCTATTACCCauVCAGTC^ 

TGTGAGGAGAACACT^TGTTTCTUVCCTTATTGTTATAATAATGACAGTAAGAACAGC^ 

GCAGAATCGAAGGAAGCAAGAGACCAAGAAATGAACCTGAAAGAAGAATCTAAAGAAGAA 

AAAAGAAGAAATGACTGGTGGAA?^TAGGTATGTTTCT6TTATGCTTAGCCAGGGCCCTC 

TGGAAGGTGACCAGTGGTGCAGGGTCCTCCGGCAGTCGTTACCTGAAGAAAAAATTCCAT 

CACAAACATGCATCGOaAGAAGACACCTGGGACCAGGCCCAACAC^ 

GGCGTGACCGGTGGATCAGGGGAaVAATACTACAAGCAGAAGTACTCCAG<^ 

AATGGAGAATCAGAGGAGTACAACAGGCGGCCAAAGAGCTGGGTGAAGTCAATCGAGGCA 

TTTGGAGAGAGCTATATTTCCGAGAAGACCAAAGGGGAGATTTCTCAGCCTGGGGCGGCT 

ATCAACGAGCACAAGAACGGCTCTGGGGGGAACAATCCTCACCAAGGGTCCTTAGACCTG 

GAGATTCGAAGCGAAGGAGGAAACATTTATGACTGTTGCATTAAAGCCCAAGAAGGAACT 

CTCGCTATCCCTTGCTGTGGATTTCCCTTATGGCTATTTTGGGGGTCGGGGCGGCCGCTT 

CCCTTTAGTGAGGGTTAATGCTTCGAGCAGACATGATAAGATACATTGATGAGT^ 

AAACCACAACTAGAATGCAGTGAAAAAAATGCTTTATTTGTGAAATTTGTGATGCTATTG 

CTTTATTTGTAACCATTATAAGCTGCAATAAACy^GTTAACa^au^ 

rpTATGTTTCAGGTTCAGGGGGAGATGTGGGAGGTTTTTTAAAGCAAGTAAAACCTCTAt^ 

AATGTGGTAAAATCCGATAAGGATCGATCCGGGCTGGCGTAATAGCGAAGAGGCCCGCAC 

CGATCGCCCTTCCCyUVCAGTTGCGCAGCCTGAATGGCGAATGGACGCGCCCTGTAGCGGC 

GCATTAAGCGCGGCGGGTGTGGTGGTTACGCGCAGCGTGACCGCTACACTTGCCAGCGCC 

CTAGCGCCCGCTCCTTTCGCTTTCTTCCCTTCCTTTCTCGCCACGTTCGCCGGCTTTCCC 

CGTCAAGCTCTAAATCGGGGGCTCCCTTTAGGGTTCCGATTTAGAGCTTTACGGCACCTC 

GACCGCAAAAAACTTGATTTGGGTGATGGTTCACGTAGTGGGCCATCGCCCTGAT^^ 

GTTTTTCGCCCTTTGACGTTGGAGTCCACGTTCTTTAATAGTGGACTCrTGTTCC^^ 

GGAACAACACTCAACCCTATCTCGGTCTATTCTTTTGATTTATAAGGGATTO 
TCGGCCTATTGGTTAAAAAATGAGCTGATTTAACa^TATTTAACGCGAAT^

ATATTAACGTTTACAATTTCGCCTGATGCGGTATTTTCTCCTTACGCATCTGTGCGGTAT 

TTCACACCGCATACGCGGATCTGCGCAGCACCATGGCCTGAAATAACCTCTGAAAGAGGA 

ACTTGGTTAGGTACCTTCTGAGGCGGAAAGAACCAGCTGTGGAATGTGTGTCAGTTAGGG 

TGTGGAAAGTCCCCAGGCTCCCCAGCAGGCAGAAGTATGCAAAGCATGCATCTCAATTAG 

TCAGCAACCUVGGTGTGGAAAGTCCCCaVGGCTCCCCAGCAGGCAGAAGTATGa^ 

CATCTCAATTAGTCAGCAACCATAGTCCCGCCCCTAACTCCGCCCATCCCGCCCCTAACT 

CCGCCCAGTTCCGCCCATTCTCCGCCCCATGGCTGACTAATTTTTTTTATTTATGCAGAG 

GCCGAGGCCGCCTCGGCCTCTGAGCTATTCCAGAAGTAGTGAGGAGGCTTTTTTGGAGGC 

CTAGGCTTTTGCAAAAAGCTTGATTCTTCTGACACAACAGTCTCGAACTTAAGGCTAGAG 

CCUVCCATGATTGAACAAGATGGATTGCACGCAGGTTCTCCGGCaSCTTG 

TATTCGGCTATGACTGGGCACAACAGACAATCGGCTGCTCTGATGCCGCCGTGTTCCGGC 

TGTCAGCGCAGGGGCGCCCGGTTCTTTTTGTCAAGACCGACCTGTCCGGTGCCCTGAATG 

AACTGCAGGACGAGGCAGCGCGGCTATCGTGGCTGGCCACGACGGGCGTTCCTTGCGCAG 

CTGTGCTCGACGTTGTCACTGAAGCGGGAAGGGACTGGCTGCTATTGGGCGAAGTGCCGG 

GGCAGGATCTCCTGTCa^TCTCACCTTGCTCCTGCCGAGAAAGTATCCATCATGGCTGATG 

CaUVTGCGGCGGCTGCATACGCTTGATCCGGCTACCrnSCCCATTCGACCACC^ 

ATCGCATCGAGCGAGCACGTACTCGGATGGAAGCCGGTCTTGTCGATCAGGATGATCTGG 

ACGAAGAGCATCAGGGGCTCGCGCCAGCCGAACTGTTCGCCAGGCTCAAGGCGCGCATGC 

CCGACGGCGAGGATCTCGTCGTGACCCATGGCGATGCCTGCTTGCCGAATATCATGGTGG 

AAAATGGCCGCTTTTCTGGATTCS^TCGACTGTGGCCGGCTGGGTC 

AGGACATAGCGTTGGCTACCCGTGATATTGCTGAAGAGCTTGGCGGCGAATGGGCTGACC 

GCTTCCTCGTCCTTTACGGTATCGCCGCTCCCGATTCGCAGCGC^^ 

TTCTTGACGAGTTCTTCTGAGCGGGACTCTGGGGTTCGAAATGACCGACCAAGCGACGCC 

CAACCTGCCATCACGATGGCCGCAATAAAATATCTTTATTTTCATTACATCTCTGTGTTG 

GTTTTTTGTGTGAATCGATAGCGATAAGQATCCGCGTATGQTGCACTCTCAGTACAATCT 

GCTCTGATGCCGCATAGTTAAGCCAQCCCCGACACCCGCCAACACCCGCTGACGCGCCCT 

GACGGQCTTQTCTGCTCCCGGCATCCGCTTACyiGACa^GCTGTGACCGTCT 

GCATGTGTCAGAGGTTTTCACCGTCATCACCGAAACGCGCGAGACGAAAGGGCCTCGTG^ 

TACGCCTATTTTTATAGGTTAATGTCATGATAATAATGGTTTCTTAGACGTCAGGTGGCA 

CTITTCGGGGAAATGTGCGCGGAACCCCTATTTGTTTATTTTTCTA^ 

TGTATCCGCTCATGAGACAATAACCCTGATAAATGCTTCa^TAATATTGAAAAAGGAAGA 

GTATGAGTATTC2^CATTTCa3TGTCGCCCTTATTCCCTTTTT^ 

CTGTTTTTGCTCACCCa^GAAACGCTGGTGAAAGTAAAAGATGCTGAAGATCAGTTGGGTG 

CACGAQTGGGTTACATCGAACTGGATCTCyVACAGCGGTAAGATCCTTGAGAGTTTTCGCC 

CCGAAGAACGTTTTCCAATGATGAGCACTTTTAAAGTTCTGCTATGTGGCGCGGTATTAT 

CCCGTATTGACGCCGGGCAAGAGCAACTCGGTCGCCGCATACACTATTCTCAGAATGACT 

TGGTTGAGTACTCACCAGTCACAGAAAAGCATCTTACGGATGGCATGACAGTAAGAGAAT 

TATGCykGTGCTGCCATAACOVTGAGTGATAACACTGCGGCCaiACTTACTTCTGAC^ 

TCGGAGGACCGAAGGAGCTAACCGCTTTTTTGO^CAACATGGGGGATCATGTAACTCGCC 

TTGATCGTTGGGAACCGGAGCTGAATGAAGCCATACCAAACGACGAGCGTGACACCACGA 

TGCCTGTAGCAATGGCAACAACGTTGCGCAAACTATTAACTGGCGAACTACTT^ 

CTTCCCGGCAACAATTAATAGACTGGATGGAGGCGGATAAAGTTGCAGGACCACTTCT^ 

GCTCGGCCCTTCCGGCTGGCTGGTTTATTGCTGATAAATCTGGAGCCGGTGAGCGTGGGT 

CTCGCGGTATCATTGCAGCACTGGGGCCAGATGGTAAGCCCTCCCGTATCGTAGTTATCT 

ACACGACGGGGAGTCAGGCAACTATGGATGAACGAAATAGACAGATCGCTGAGATAGGTG 

CCTCy\CTGATTAAGCATTGGTAACTGTCAGACCAAGTTTACTCATATATACTTTAGAT 

ATTTAAAACTTCATTTTTAATTTAAAAGGATCTAGGTGAAGATCCTTTTTGATAATCTCA 

TGAC(:aVAA?^TCCCTTAACGTGAGTTTTCGTTCCACTGAGCGTCAGACCCCGTAGi^^ 

TCS^GGATCTTCTTGAGATCCTTTTTTTCTGCGCGTAATCTGCTGCTTGOU^ 

AACCACCGCTACCAGCGGTGGTTTGTTTGCCGGATCAAGAGCTACCAACTCTTTTTCCGA 

AGGTAACTGGCTTCAGCAGAGCGCAGATACCAAATACTGTCCTTCTAGTGTAGCCGTAGT 

TAGGCCACCACTTCAAGAACTCTGTAGCACCGCCTACATACCTCGCTCTGCTAATCCTGT 

TACCAGTGGCTGCTGCCAGTGGCGATAAGTCGTGTCTTACCGGGTTGGACTCAAGACGAT 

AGTTACCGGATAAGGCGCAGCGGTCGGGCTGAACG6GGGGTTCGTGCACACAGCCCAGCT 

TGGAGCGAACGACCTACACCGAACTGAGATACCTACAGCGTGAGCTATGAGAAAGCGCCA 

CGCTTCCCGAAGGGAGAAAGGCGGACAGGTATCCGGTAAGCGGCAGGGTCGGAACAGGAG 

AGCGCACGAGGGAGCTTCCAGGGGGAAACGCCTGGTATCTTTATAGTCCTGTCGGGTTTC 

GCCACCTCTGACTTGAGCGTCGATTTTTGTGATGCTCGTCAGGGGGGCGGAGCCTATGGA 

AAAACGCCAGCyVACGCGGCCTTTTTACGGTTCCTGGCCTTTTGCr 

TGGCTCGACAGATCT 
SEQ ID No. 8 - pESYNGPRRE

TCAATATTGGCCATTAGCCATATTATTCATTGGTTATATAGCa^Ti^TCyUlTATT 

TTGGCCa^TTGCATACGTTGTATCTATATCATAATATGTACATTTATATTGG^ 

AATATGACCGCCATGTTGGCATTGATTATTGACTAGTTATTAATAGTAATCAATTACGGG 

GTCATTA6TTCATAGCCCATATATGGAGTTCCGCGTTACATAACTTACGGTAAATGGCCC 

GCCTGGCTGACCGCCCAACQACCCCCGCCCATTGACGTCAATAATGACGTATGTTCCCAT 

AGTAACGCCAATAGGGACTTTCCATTGACGTO^TGGGTGGAGTATTTACGGTAAACTOT 

CCaCTTGGCAGTAOlTaiAGTGTATCaTATGCCy^GTCCGCCCCCTATTGAOT 

CGGTAAATGGCCCGCCTGGCATTATGCCCAGTACATGACCTTACGGGACTTTCCTACTTG 

GCAGTACATCTACGTATTAGTCATCGCTATTACCATGGTGATGCGGTTTTGGCAGTACAC 

CAATGGGCGTGGATAGCGGTTTGACTCACGGGGATTTCCAAGTCTCC^ 

a^TGGGAGTTTGTTTTGGO^CCAAAATCaACGGGACTTTCC^^ 

CGATCGCCCGCCCCGTTGACGCAAATGGGCX3GTAGGCGTGTACGGTGGGAGGTCTATATA 

AGCAGAGCTCGTTTAGTGAACCGTCAGATCACTAGAAGCTTTATTGCGGTAGTTTATCAC 

AGTTAAATTGCTAACGCAGTCAGTGCTTCTGACACAA(^GTCTCGAACTTAAGCTGCAGT 

GACTCTCTTAAGGTAGCCTTGCAGAAGTTGGTCGTGAGGCACTGGGCAGGTAAGTATC^ 

GGTTACAAGACAGGTTTAAGGAGACCAATAGAAACTGGGCWGTCGAGACAGAGAAGACT 

CTTGCGTTTCTGATAGGCACCTATTGGTCTTACTGAa^TCCACTOT 

AGGTGTCCACTCCGAGTTC2^TTACAGCTCTTAAGGCTAGAGTACTTAATACGACTCACT 

ATAGGCTAGAGAATTCGAGAGGGGCGCAGACCCTACCTGTTGAACCTGGCTGATCGTAGG 

ATCCCCGGGACAGCAGAGGAGAACTTACAGAAGTCTTCTGGAGGTGTTCCTGGCCAGAAC 

ACyiGGAGGACAGGTAAGATGGGCGATCCCCTCACCTGGTCCaU^GCCCTGAAQAA^ 

AAAAAGTCACCGTT(a.GGGTAGayVAAAGCTTACCACAGGC^ 

CCCTGGTGGATCTTTTCCaiCGAaVCTAATTTCGTTAAGG?lGAAAGATTGGa\ACT 

ACGTGATCCCCCTCTTGGAGGACGTGACCCS^CATTGTCTGGGCAGGAGCGCGAAGCTT 

TCGAGCGCACCTGGTGGGCOVTO^CGCAGTCAAAATGGGGCTGCAAATCAAC^ 

TTGACGGTAAAGCTAGCTTTCAACTGCTCCGCGCTAAGTACGAGAAGAAAACCGCCAACA 

AGAAACAATCCGAACCTAGCGAGGAGTACCCAATTATGATCGACGGCGCCGGCAATAGGA 

ACTTCCGCCGACTGACTCCCAGGGGCrATACCACCTGGGTCaU^CaiCC^ 

GACTTTTGAACGAAGCCTCC»GAACCroTTCGGCATCCTGTCTGTG 

AAGAAATGAATGCTTTTCTCGACGTGGTGCGAGGACAGGCTGGACAGAAACAGATCCTGC 

TCGATGCCATTGACAAGATCGCCGACGACTGGGATAATCGCCACCCCCTGCCAAACGCCC 

CTCTGGTGGCTCCCCCACAGGGGCCTATCCCTATGACCGCTAGGTTCATTAGGGGACTGG 

GGGTGCCCCGCGAACGCCAGATGGAGCO^GCaVTTTGACaVATTTAGGCAGACCTACa^G^^ 

AGTGGATCATCGAAGCCATGAGCGAGGGGATTAAAGTCATGATCGGAAAGCCCAAGGCAC 

AGAACATCAGGC».GGGGGCCAAGGAACCATACCCTGAGTTTGTCGACAGGCTTCTGTCCC 

AGATTAAATCCGAAGGCCACCCTCAGGAGATCTCCAAGTTCTTGACAGACACACTGACTA 

TCCAAAATGCAAATGAAGAGTGCAGAAACGCCATGAGGCACCTCAGACCTGAAGATACCC 

TGGAGGAGAAAATGTACGCATGTCGCGACyVTTGGCACTACCAAGCAAAAGATGATGCTGC 

TCGCCAAGGCTCTGCAAACCGGCCTGGCTGGTCCATTCM^GGAGGAGCACTGAAGGGAG 

GTCCATTGAAAGCTGOVCAAACATGTTATAATTGTGGGAAGCCAGGACATTTATCTAGTC 

AATGTAGAGCACCTAAAGTCTGO-ITTAAATGTAAACAGCCTGGACATTTCTCAAAGCAAT 

GCAGAAGTGTTCCAAAAAACGGGAAGCAAGGGGCTCAAGGGAGGCCCCAGAAACAT^C^^ 

TCCCGATACAACAGAAGAGTCAGCACAACAAATCTGTTGTACAAGAGACTCCTCAGACTC 

AAAATCTGTACCCAGATCTGAGCGAAATAAAAAAGGAATACAATGTCAAGGAGAAGGATC 

AAGTAGAGGATCTCAACCTGGACAGTTTGTGGGAGTAACATACUVATCTCGAGAAGAGG^^ 

CACTACCATCGTCCTGATCmTGACACCCCTCTTAATGTGCTGCTGGACACCGGAGC 

CACCAGCGTTCTCACTACTGCTCACTATAACAGACTGAAATACAGAGGAAGGAAATACCA 

GGGCACAGGCATCATCGGCGTTGGAGGCAACGTCGAAACCTTTTCCT^CTCCTGTCAC^^ 

CAA?UAGAAGGGGAGACACATTAAAACCAGAATGCTGGTCGCCGACATCCCCGTCACCAT 

CCTTGGCAGAGACATTCTCCAGGACCTGGGCGCTAAACTCGTGCTGGCACAACTGTCTAA 

GGAAATCAAGTTCCGCAAGATCGAGCTGAAAGAGGGCACT^TGGGTCCAAAAATCCCCCA 

GTGGCCCCTGACCAAAGAGAAGCTTGAGGGCGCTAAGGAAATCGTGCAGCGCCTGCTTTC 

TGAGGGCAAGATTAGCGAGGCCAGCGACAATAACCCTTACAACAGCCCCATCTTTGTGAT 

TAAGAAAAGGAGCGGCAAATGGAGACTCCTGCAGGACCTGAGGGAACTCAACAAGACCGT 

CCAGGTCGGAACTGAGATCTCTCGCGGACTGCCTCACCCCGGCGGCCTGATTAAATGCAA 

GCyVCATGACAGTCCTTGACTVTTGGAGACGCTTATTTTACCATCrc 

TCGCCCCTATACTGCTTTTACCATCCCCAGCATCAATCACCAGGAGCCCGATAAACGCTA 
TGTGTGGAAGTGCCTCCCCCAGGGATTTGTGCTTAGCCCCTACATTTACCAGAAGACACT 
TOUlGAGATCCTCCa^CCTTTCCGCGAAAGATACCCyVGAGGTTCAACTCTACCyUlT^ 
GGACGACCTGTTCATGGGGTCCAACGGGTCTAAGAAGCAGCACAAGGAACTCATCATCGA

ACTGAGGGCAATCCTCCTGGAGAAAGGCTTCGAGACACCCGACGACAAGCTGCAAGAAGT 

TCCTCCATATAGCTGGCTGGGCTACCAGCTTTGCCCTGAAAACTGGAAAGTCCAGAAGAT 

GCAGTTGGATATGGTCAAGAACCCAACACTGAACGACGTCCAGAAGCTCATGGGCAATAT 

TACCTGGATGAGCTCCGGAATCCCTGGGCTTACCGTTAAGCACATTGCCGC^CTAGAAA 

AGGATGCCTGGAGTTGAACCAGAAGGTCATTTGGACAGAGGAAGCTCAGAAGGAACT 

GGAGAATAATGAAAAGATTAAGAATGCTCAAGGGCTCCAATACTACAATCCCGAAGAAGA 

aatgttgtgcgaggtcgaaatcactaagaactacgaagccacctatgtcatcaaacagtc 

ccaaggcatcttgtgggccggaaagaaaatcatgaaggccaacaaaggctggtcca^ccgt 

taaaaatctgatgctcctgctccagcacgtcgccaccgagtctatcacccgcgtcggcaa 

gtgccccaccttgaaagttcccttcactaaggagcaggtgatgtgggagatgca^^ 

ctggtactactcttggcttcccgagatcgtctacacccaccaagtggtgcacgacgactg 

gagaatgaagcttgtcgaggagcccactagcggaattacaatctataccgacggcggaaa 

gcaaaacggagagggaatcgctgca^tacgtcacatctaacggccgcaccaagcyuv^^ 

gctcggccctgtcactcaccaggtggctgagaggatggctatccagat6gcccttgagga 

cactagagao^gcaggtgaacattgtgactgacagctactactgctggaaaaac^ 

agagggccttggcctggagggaccccagtctccctggtggcctatcatccagaatatccg 

cgaaaaggaaattgtctatttcgcctgggtgcctggacacaaaggaatttacggcaacca 

actcgccgatgaagccgcg^u^ttaaagaggaaatcatgcttgcctaccagggcac^ 

gattaaggagaagagagacgaggacgctggctttgacctgtgtgtgccatacgacatcat 

gattcccgttagcgacaca^gatcy^ttca^ccgatgta^gatccaggtgcc^ 

ttovtttggttgggtgaccggtvaagtccagcatggctaagcyvgggtcttctgat^ 

gggaatcattgatgaaggatacaccggcgaaatccaggtgatctgcacaaat^ 

aagcaatattaagcttatcgaagggcagaagttcgctcaactcatcatc^ 

cagcaatto^gacaaccttgggacgaaaacaagattagccagac^ 

cggcagcaovggtgtgttctggatggagaacatccaggaagcaavggacgag^^ 

ttggcacacctcccctaagattttggcccx3g2^ttac:s^a!^ 

gcagatcacacaggaatgcccccactgcaccaaacaaggttctggccccgccggctgcgt 

gatgaggtccccct^tcactggcaggcagattgcacccacctcgaavact^a^ 

gaccttcgtggagagcaattccggctacatccacgcau^caercctctccaaggaaaatc^ 

attgtgcacctccctcgcaattctggaatgqgccaggctgtoctctca^^ 

caccgaca^cggolccaactttgtggctgaaccrgtggtgaatctgctqaag 

aatcgcccacaccsvctggcattccctatcaccctgaaagccagggcattgtcgagagggc 

caacagaactctgaaag;^2uu^gatccaatctcacagagac?uvtacac^ 

cgcacttcagctcgcccttatcacctgcaacaaaggaagagaaagcatgggcggccagac 

cccctgggaggtcttcatcactaaccaggcccaggtcatccatgaaaagctgctcttgca 

gcaggcccagtcctcau^aaagttctgcttttataagatccccggtgagcacgactggaa 

aggtcctacaagaqttttgtggaaagga6acggcgcagttgtggtgaacgatgagggcaa 

GGGGATCATCGCTGTGCCCCTGACaLCGCACCau^GCTTCTCyiTCAAGCC^^ 

ACGAATCCCAGGGGGAATCTO^CCCCTATTACCCAACAGTCAGAAAAATCrAAGTG 

GGAGAACACAATGTTTCAACCTTATTGTTATAATAATGACAGTAAGAACAGCATGGCAGA 

ATCGAAGGAAGCAAGAGACCAAGAAATGAACCTGAAAGAAGAATCTAAAGAAGAAAAAAG 

AAGAAATGACTGGTGGAAAATAGGTATGTTTCTGTTATGCTTAGCCAGGGCCCTCTGGAA 

GGTGACCAGTGGTGCAGGGTCCTCCGGCAGTCGTTACCTGAAGTU^AAAATTC^ 

ACATGCATCGCGAGAAGAOICCTGGGACCAGGCCCAACACAACATACACC^^^ 

GACCGGTGGATCAGGGGACAAATACTACAAGCAGAAGTACTCCAGGAACGACTGGAATGG 

AGAATCAGAGGAGTACAACT^GGCGGCCAAAGAGCTGGGTGAAGTCAATCGAGGCA 

AGAGAGCTATATTTCCGAGAAGACCAAAGGGGAGATTTCTCAGCCTGGGGCGGCTATCAA 

CGAGCACy^GAACGGCTCTGGGGGGAACAATCCTCACOUVGGGTCCTTAG^ 

TCQAAGCGAAGGAGGAAACATTTATGACTGTTGCATTAAAGCCCyVAGAAGGAACTCTCG^ 

TATCCCTTGCTGTGGATTTCCCTTATGGCTATTTTGGGGGTCGGGGCGGCCGCTTCCCTT 

TAGTGAGGGTTAATGCTTCGAGCAGACATGATAAGATACATTGATGAGXTTGGACAAACC 

ACAACTAGAATGCAGTGAAAAAAATGCTTTATTTGTGAAATTTGTGATGCTATTGCTTTA 

TTTGTAACCATTATAAGCTGCAATAAACAAGTTAAOUVaU^CAATTGC^ 

TTTCAGGTTCa^GGGGGAQATGTOGGAGGTTTTTTAAAGC^GTAAAACCrCTAC^^ 

GGTAAAATCCGATAAGGATCGATCCGGGCTGGCGTAATAGCGAAGAGGCCCGCACCGATC 

GCCCTTCCCAACAGTTGCGCAGCCTGAATGGCGAATGGACGCGCCCTGTA6CGGCGCATT 

AAGCGCGGCGGGTGTGGTGGTTACGCGCAGCGTGACCGCTACACTTGCCAGCGCCCTAGC 

GCCCGCTCCTTTCGCTTTCTTCCCTTCCTTTCTCGCCACGTTCGCCGGCTTTCCCCGTCA 

AGCTCTAAATCGGGGGCTCCCTTTAGGGTTCCGATTTAGAGCTTTACGGCACCTCGACC6 

CAAAAAACTTGATTTGGGTGATGGTTCaVCGTAGTGGGCCATCGCCCTGATAGACGGTTTT 
TCGCCCTTTGACGTTGGAGTCCACGTTCTTTAATAGTGGACTCTTGTTCCAAACTGGAAC

AACACTCAACCCTATCTCQGTCTATTCTTTTGATTTATAAGGGATTTTGCCGATTTCGGC 

CTATTGGTTAAAAAATGAGCTGATTTAACAAATATTTAACGCGAATTTTAACAAAATATT 

AACGTTTACAATTTCGCCTGATGCGGTATTTTCTCCTTACGCATCTGTGCGGTATTTCAC 

ACCGCATACGCGGATCTGCGCAGCACCATGGCCTGAAATAACCTCTGAAAGAGGAACTTG 

GTTAGGTACCTTCTOAGGCGGAAAGAACCAGCTGTGGAATGTGTGTCAGTTAGGGT^ 

AZ^GTCCCC?IGGCTCCCCAGCAGGCAGAAGTATGCAAAGCATG(:7VTCTCAATT^^ 

AACCAGGTGTGGAAAGTCCCCAGGCTCCCCAGCAGGCAGAAGTATGCAAAGCATGC^ 

Cy^TTAGTCy^GCAACCATAGTCCCGCCCCTAACTCOSCCCATCCCGCCCCTAACTC 

O^GTTCCGCCCATTCTCCGCCCCaLTGGCTGACTAATTTTTTTTATrrATGCAGA^ 

GGCTOCCTCGGCCTCTGAGCTATTCCAGAAGTAGTGAGGAGGCTTTTTTGGAGGC 

CTTTTGCAAAAAGCTTGATTCOTCTGACyvCAACAGTCTCGAACTTAAGGCTA 

ATGATTGAACAAGATGGATTGCACGa^GGTTCTCCGGCCGCTTGGGTGGAGAGGCTAO^ 

GGCTATGACTGGGCACAACAGACAATCGGCTGCTCTGATGCCGCCGTGTTCCGGCTO 

GCGCAGGGGCGCCCGGTTCnTTTTGTCAAGACCGACCTGTCCGGTGCCCTGAATQAACTO 

aVGGACGAGGCy^GCGCGGCTATCGTGGCTGGCaiCGACGGGCGTTCCTTGCGaVGCTGTQ 

CTCGACGTTGTCACTGAAGCGGGAAGGGACTGGCTGCTATTGGGCGAAGTGCCGGGGCAG 

GATCTCCTGTCATCTCACCTTGCTCCTGCCGAGAAAGTATCCATCATGGCTGATGCAATG 

CGGCGGCTGCATACGCTTGATCCGGCTACCTGCCCATTCGACCACCAAGCGAAACATCGC 

ATCGAGCX5AGCACGTACTCGGATGGAAGCCGGTCTTGTCGATCAGGATGATCTGGACGAA 

GAGCATaVGGGGCTCGCGCCAGCCGAACTGTTC?GCCAGGCTCAAGGCQCGCAT^ 

GGCGAGGATCTCGTCGTGACCCATGGCGATGCCTGCTTGCCGAATATCATGGTGGAAAAT 

GGCCGCTTTTCTGGATTCATCGACTGTGGCCGGCTGGGTGTGGCGGACCGCTATCAGGAC 

ATAGCGTTGGCTACCCGTGATATTGCTGAAGAGCTTGGCGGCGAATGGGCTGACCGCTTC 

CTCGT6CTTTACGGTATCGCCGCTCCCGATTCGCAGCGCATCGCCTTCTATC6CCTTCTT 

GACGAGTTCTTCTGAGCGGGACTCIXSGGGTTCGAAATGACCGACCAAGCGACGCCC^ 

TGCCATCaVCGATGGCCGa^TAAAATATCTTTATTTTCaiTTACATCTC 

TTGTGTGAATCGATAGCGATAAGQATCCGCGTATGGTGCACTCTCAGTACAATCTGCTCT 

GATGCCGCATAGTTAAGCCAGCCCCGACACCCGCCAACACCCGCTGACGCGCCCTGACGG 

GCTTGTCTGCTCCCGGCATCCGCTTACAGACAAGCTGTGACCGTCTCCGGGAGCTGCATG 

TGTCAGAGGTTTTCACCGTCATCACCGAAACGCGCGAGACGAAAGGGCCTCGTGATACGC 

CTATTTTTATAGGTTAATGTCATGATAATAATGGTTTCTTAGAOSTCAGGTGGa^CTTTT 

CGGGGAAATGTGCGCGGAACCCCTATTTGrrTATTTTTCT 

CCGCTCATGAGACAATAACCCTGATAAATGCTTCAATAATATTGAAZVAAGGAAGAGTATG 

AGTATTCAACATTTCCGTGTCGCCCTTATTCCCTTTTTTGCGGCy^TTTTGCCTTCCT^ 

TTTGCTCACCCAGAAACGCTGGTGAAAGTAAAAGATGCTGAAGATCAGTTGGGTGCACGA 

GTGGGTTACATCGAACTGGATCTCAAOVGCGGTAAGATCCTTGAGAGTTTTCGCCCCGAA 

GAACGTTTTCCyVATGATGAGCACTTTTAAAGTTCTGCTATGTGGCGCGGTATTATCCCGT 

ATTGACGCCGGGCa^GAGCAACTCGGTCGCCGCATACACTATTCTCAGAATGACTTGGTT 

GAGTACTCACCAGTCACAGAAAAGCATCTTACGGATGGCATGACAGTAAGAGAATTATGC 

AGTGCTGCCATAACCATGAGTGATAACACTGCGGCCAACTTACTTCTGACAACGATCGGA 

GGACCGAAGGAGCTAACCGCTTTTTTGCACAACATGGGGGATCATGTAACTCGCCTTGAT 

CGTTGGGAACCGGAGCTGAATGAAGCCATACCAAACGACQAGCGTGACy^CaiCaATGCC^ 

GTAGGAATGGCAACAACGTTGCGCAAACTATTAACTGGCQAACTACTTACTCT^^ 

CG6CAACAATTAATAGACTGGATGGAGGCGQATAAAGTTGCAGGACCACTTCTGCGCTCG 

GCCCTTCCGGCTGGCTGGTTTATTGCTGATAAATCTGGAGCCGGTGAGCGTGGGTCTCGC 

GGTATCATTGCAGCACTGGGGCCAGATGGTAAGCCCTCCCGTATCGTAGTTATCTACACG 

ACGGGGAGTCAGGCAACTATGGATGAACGAAATAGACAGATCGCTGAGATAGGTGCCTCA 

CTGATTAAGCATTGGTAACTGTCAGACCAAGTTTACTCyVTATATACTTTAGATTGATTTA 

AAACTTO^TTTTTAATTTAAAAGGATCTAGGTGAAGATCCTTTTTG 

AAAATCCCTTAACGTGAGTTTTCGTTCCACTGAGCGTCAGACCCCGTAGAAAAGATCAAA 

GGATCTTCTTGAGATCCTTTTTTTCTGCGCGTAATCTGCTGCTTGCAAACAAAAAAACCA 

CCGCTACCAGCGGTGGTTTGTTTGCCGGATCAAGAGCTACCa^CTCTTTTTCCGAAGGTA 

ACTGGCTTCAGCAGAGCGCAGATACCAAATACTGTCCTTCTAGTGTAGCCGTAGTTAGGC 

CACCACTTCAAGAACTCTGTAGCACCGCCTACATACCTCGCr^ 

GTGGCTGCTGCCAGTGGCGATAA6TCGTGTCTTACCGGGTTGGACTCAAGACGATAGTTA 
CCGGATAAGGCGCAGCGGTCGGGCTGAACGGGGGGTTCGTGCACACAGCCCAGCTTGGAG 
CGAACGACCTACACCGAACTGAGATACCTACAGCGTGAGCTATGAGAAAGCGCCACGCTT 
CCCG7^GGGAGAAAGGCGGACAGGTATCCGGTAAGCGGCAGGGTCX3GAACAGGAGAGCGC 
ACGAGGGAGCTTCC:a.GGGGGAAACGCCTGGTATCTTTATAGTCCTGTCGGGTTTCGCCAC 
CTCTGACTTGAGCGTCGATTTTTGTGATGCTCGTCAGGGGGGCGGAGCCT 
GCa^GOU^CGCGGCCrTTTTACGGTTCCTGGCCTTTTGCTGGCCTTOT^
CGACAGATCT 



CTAAATTGTIU^GCGTTAATATTTTGTTAAAATTCGCGTTAAATTTTTGTTAAATCAGCTC

ATTTTTTAACCAATAGGCCGTWVTCGGOUWVATCCCTTATAAATCAAAAGAATAGAC 

GATAGGGTTGAGTGTTGTTCCAGTTTGGAACAAGAGTCCACTATTAAAGAACGTGGACTC 

CAACGTCAAAGGGCGAAAAACCGTCTATCAGGGCGATGGCCCACTACGTGAACCATCACC 

CTAATCAAGTTTTTTGGGGTCGAGGTGCCGTAAAGCACTAAATCGGAACCCTAAAGGGAG 

CCCCCGATTTAGAGCTTGACGGGGAAAGCCAACCTGGCTTATCQAAATTAATACGACTCA 

CTATAGGGAGACCGGCAGATCTTGAATAATAAAATGTCT6TTT6TCCGAAATACGCGTTT 

TGAGATTTCT6TCGCCQACTAAATTCATGTCGCGCGATAGTGGTGTTTATCGCCGATAGA 

GATGGCGATATTGGAAAAATTGATATTTGAAAATATGGCATATTGAAAAT6TCGCCGATG 

TGAGTTTCTGTGTAACTGATATCGC<^TTTTTCCAAAAGTQATTTTTGGGCATACGCGAT 

ATCTGGCGATAGCGCTTATATCGTTTACGGGGQATGGCGATAGACGACTTTGGTSACTTG 

GGCGATTCTGTGTGTCGOUATATCGCAGTTTCGATATAGGTGACAGAaSATATGAGGCT 

ATATCGCCGATAGAGGCGACATCAAGCTGGCACATGGCCyUVTGCATATCGATCTATACAT 

T6AAT(»ATATTQGCCATTAOCa^TATTAlTCaTTGGTTATATAGCATAAATCAATATTG 

GCTATTGGCCATTGCATACGTTGTATCCATATCGTAATATGTACATTTATATTGGCTCAT 

GTCCS^CATTACCGCCATGTTGACATTGATTATTGACTAGTTATTAATAGTAATCAATTA 

CGGGGTCaTTAGTTCATAGCCCATATATGGAGTTCCGCGTTACATAACTTACGGTAAATG 

GCCCGCCTGGCTGACCGCCO^CXSACCCCCGCCCATTGACGTCAATAATGACGTATGTTC 

CCATAGTAACGCCaATAGGGACTTTCCS^TTGACGTCAATGGGTGGAGTATTTACGGTAAA 

CT^CCCaCTTGQCAGTACATCAAGTGTATCATATGCCAAGTCCGCCCCCTATTGACGTCA 

ATGACGQTAAATGGCCCGCCTGGCATTATGCCCAGTACATGACCTTACGGGACTTTCCTA 

CTTGGCAGTACATCTACGTATTA6TaVTCGCTATTACCATGGTOATGaWITTTa<KaU3 

ACACaUVTGGGCGTGGATAGCG6TTTGACrra^GGGGATTTCCSUW3TCTC(^^ 

ACGTCAATGGGAGTTTGTTTTGGCACOUiAATCaACGQGACTTTCa^T^^ 

ACTGCGATCQCCCGCCCCaTTOACGCaAATGGGCGGTAGGCGTGTACGGTGGGAGGTCTA 

TATAAGCAaRSCTCGTTTAGTOAACCGQGCACTCAGATTCTGCGGTCTGAGTCCCTTCTC 

TGCTGGGCTGAAAAGGCCTTTGTAATAAATATAATTCTCTACTCAGTCCCTGTCTCTAQT 

TTOTCTGTTCGAGATCCTAOlGTTGGCGCCCGAACAGGGACCTOASAGGGGCGCAfaACCC 

TACCTGTTGAACCTGGCTGArCGTAGGATCCCCaGGACAQa^GAQGASAACTTACASAAG 

CaTQQAGCAAGaCGCTCAAflAAGTTAGAGAAGQTGACGGTACAAGGGTCTCAGAAATTAA 

CTACTGQTAACTGTAATTGGGCGCTAAGTCTAGTAGACTTATTTCATGATACCAACTTTG 

TAAAAQAAAAGGACTGGCAGCTGAGGGATGTCATTCCATTGCTGGAAGATGTAACTCAQA 

CGCTGTCAGGACAAGAAAGAGAGGCCTTTGAAAGAACATOGTCGGCAATTTCTQCTOT^ 

AGATGGGCCTCCAGATTAATAATGTAGTAOATGGAAAGGCATCATTCCaQCTCCTAAGAG 

CGAAATATGAAAAGAAGACTQCTAATAAAAAGCAGTCTGAGCCCTCrraAAGAATATCTCT 

AGAACTAGTGGATCCCCCGQGCTGCAGGAGTGGGGAGGCACGATGGCCGCTTTGGTC6AQ 

GTOGATCCGGCO^TTAGCCaTAlTATTCATTGGTTATATAGCATAAATCAA^^^^ 

TTGGCCATTGCATACGTTGTATCCATATCATAATATGTAO^TTTATATTGGCTCATGTCC 

AACATTACCGCCATGTTGAOVTTGATTATTGACTAGTTATTAATAGTAAlXaATTACGGG 

Ar™f^^^''^'^^'^''^''^^^*^^^'^^<3TCAATAATQACGTATGTTC 
JSS^S^^'''^'''^^°^^®^^°"^^'^<^'^°GA6TATTTACGG^^ 
SS^!f^^'''^'''^''''*'^^^^^^^°C^QTACGCCCCCTATTGACGTC^TG^ 
aSGTAAATGGCCCGCCTGGOlTTATGCCa^GTACATGACCTTATGGGACTTTCCTACTTG 

S^^^^nr^'''^'''^^^°^^^^^°°<^^CCAAGTCTCC^^ 

CAATGGGafiTTTGTTTTGGOVCCAAAATCS^CGGGACTTTCa^TGTCGTAACAAOT 

»^^™^^^^'^°°°'=®^'^^<^^'^°TACGGTGQGAGGTCTATATAAGCAGAGC 

TCGTTTAGTGAACCQTCAGATCGCCTGGAGACGCCATCCAC6CTGTTTT6ACCTCCATAG 

SJSn^n^*^'^^^^^^^°^^^^^^<^°CCCCAAGCTOCAGCTGCTC6A^^^ 

GCCA6CTGGCGTAATAGCGAAGAGGCCCGCACCGATCGCCCTTCCCAACAGTTGCGCAGC 
rJ?JtSn^n^''°°^®^"^^°^'^''°°^^^^^^C^^<5CGGTGCCG 
CTGGAGT6CGATCTTCCTGAGGCC6ATACTGTCGTOGTCCOTrC3VAACTGGaM3ATGCAC 
GGTTACGATGCGCCCATCTACACCAACGTAACCTATCCCATTACGGTCAATCCGCCGTTT 

CTACAGGAA6GCCAGACGCGAATTATTTTT6ATGGCGTTAACTCGGCGTTTCATCTGTGG 
TGCAACGGGCGCTGGGTCGGTTACGGCCy^GGACy^GTCGTTTGCCGTCTGAATTTGAC^^

AGCGCATTTTTACGCGCCGGAGAAAACCGCCTCGCGGTGATGGT6CTGCGTTGGAGTGAC 

GGCAGTTATCTGGAAGATCAGGATATGTGGCGGATGAGCGGCATTTTCCGTGACGTCTCG 

TTGCTGCATAAACCGACTACACAAATCAGCGATTTCCATGTTGCCACTCGCTTTAATGAT 

GATTTCAGCCGCGCTGTACTGGAGGCTGAAGTTCAGATGTGCGGCGAGTTGCGTGACTAC 

CTACGGGTAACAGTTTCTTTATGGCAGGGTGT^CGCAGGTCGCCAGCGGCACCGCGCCT 

TTCGGCGGTGAAATTATCGATGAGCGTGGTGGTTATGCCGATCGCGTCACACTACGTCTG 

AACGTCGAAAACCCGAAACTGTGGAGCGCCGAAATCCCGAATCTCTATCGTGCGGTGGTT 

6AACTGCACACCGCCGACGGCACGCTGATTGAAGCAGAAGCCTGCGATGTCGGTTTCCGC 

GAGGTGCGGATTGAAAATGGTCTGCTGOTGCTGAACGGCa^CCGTTGCTGATTCGAGGC 

GTTAACCGTCSICGAGCATCATCCTCTGCATGGTCAQGTCATGGATGAGCAGACGATGGTG 

CAGGATATCCTGCTGATGT^GCAGAACAACTTTAACGCCGTGCGCTGTTCGCATTATCCG 

AACCATCCGCTGTGGTACACGCTGTGCGACCGCTACGGCCTGTATGTGGTGGATGAAGCC 

AATATTGAAACCCACGGCATGGTGCCAATGAATCGTCTGACCGATGATCCGCGCTGGCTA 

CC6GCGATGAGCGAACGCGTAACGCGAATG6TGCAGCGCGATCGTAATCACCCGAGTGTG 

ATCATCTCGTCGCTGGGGAATGAATCAGGCCy^CGGCGCTAATCACGACGCGCTO 

TGGATCAAATCTGTCGATCCTTCCCGCCCGGTGCAGTATGAAGGCGGCGGAGCCGACACC 

ACGGCCACC6ATATTATTTGCCCGATGTACGCGCGCGTGGATGAAGACCAGCCCTTCCCG 

GCTGTGCCQAAATGGTCCATCAAAAAATGGCTTTCGCTACCTGGAGAGACGCGCCCGCTG 

ATCCTTTGCGAATACGCCCACGCGATGGGTAACAGTCTTGGCGGTTTCGCTAAATACTGG 

CAGGCGTTTCGTO^TATCCCCGTTTACaVGGGCGGCTTCQTerGGGAC^ 

TCGCTGATTAAATATGATGAAAACGGCAACCCGTGGTCGGCTTACGGCGGTGATTTTGGC 

GATACGCCGAACGATCGCCAGTTCTGTATGAACGGTCTGGTCTTTGCCGACCGCACGCCG 

CATCCAGCGCTGACGGAAGCAAAACACCAGCAGCAGTTTTTCCAGTTCCGTTTATCCGGG 

CAAACCATCGAAGTGACCAGCGAATACCTGTTCCGTCATAGCGATAACGAGCTCCTGCAC 

TGGATGGTGGCGCTGGATGGTAAGCCGCTGGCAAGCGQTGAAQTGCCTCTGGATGTCGCT 

CCTLCAAGGTAAACAGTTOATTGAACTGCCTGAAC^ 

CTCTGGCTOVCAGTACGCGTAGTGCAACCGAACGCGACCGCATGGTaVGAAGCCGGGCAC 

ATCAGCGCCTGGCAGCAGTGGCGTCTGGCGGAAAACCTCAGTGTGACGCTCCCCGCCGCG 

TCCCACGCCATCCCGCATCTGACCACCAGCGAAATGGATTTTTGCATCGAGCTGGGTi\AT 

AAGCGTTGGCAATTTAACCGCCAGTCAGGCTTTCTTTCACAGATGTGGATTGGCGATAAA 

AAACAACTGCTGACGCCGCTGCGCGATCAGTTCACCCGTGCACCGCTGGATAACGACATT 

GGCGTAAGTGAAGCGACCCGCSVTTGACCCTAAOSCCTGGGTCGAACGCnX^^ 

GGCCATTACCAGGCCGAAGCAGCGTTGTTGCAGTGCACGGCAGATACACTTGCTGATGCG 

GTGCTGATTACGACCGCTCACGCGTGGCAGCATCAGGGGAAAACCTTATTTATCAGCCGG 

AAAACCTACCGGATTGATGGTAGTGGTCAAATGGCGATTACCGTTGATGTTGAAGTGGCG 

AGCGATACACCGCATCCGGCGCGGATTGGCCTGAACTGCCAGCTGGCGCAGGTAGCAGAG 

CGGGTAAACTGGCTCQGATTAGGGCCGCAAGAAAACTATCCCGACCGCCTTACTGCCGCC 

TGTTTTGACCGCTGGGATCTGCCATTGTCaVGACATGTATACCCCGTACGTCTTCCCGAGC 

GAAAACGGTCTGCGCTGCGGGACGCGCGAATTGAATTATG6CCCACACCAGTGGCGCGGC 

gacttccagttcaacatcagccgctacagtcaacagcaactgatgga;^ 

catctgctgcacgcggaagaaggcacatggctgaatatcgacggtttccatatggggatt 

ggtggcgacgactcctggagcccgtcagtatcggcggaattccagctgagcqccggtcgc 

taccattaccagttggtctggtgtcaaaaataataataaccgggcaggggggatccgcag 

atccggctgtggaatgtgtgtcagttagggtgtggaaagtccccaggctccccagcaggc 

AGAAGTATGCAAAGCATGCCTGCAGGAATTCGATATCAAGCTTATCGATACCGTCGACCT 

cgagggggggcccggtacccagcttttgttccctttagtgagggttaattgcgcgggaag 

TATTTATCACTAATCAAGCACAAGTAATACATGAGAAACTTTTACTACAGCAAGCACA^ 

cctccaaaaaattttgtttttacaaaatccctggtgaacatgattggaagggacctacta 
gggtgctgtggaagggtgatggtgcagtagtagttaatgatgaagga7vagggaataattg 
ctgtaccattaaccaggactaagttactaataaaaccaaattgagtattgttgcaggaag 

CAAGACCCAACTACCATTGTCAGCTGTGTTTCCTGAGGTCTCTAGGAATTGATTACCTCG 

ATGCTTCATTAAGGAAGAAGAATAAACAAAGACTGAAGGCAATCCAACAAGGi^ 

CTCAATATTTGTTATAAGGTTTGATATATGGGAGTATTTGGTAAAGGGGTAACATGGTCA 

GCATCGCATTCTATGGGGGAATCCCAGGGGGAATCTCAACCCCTATTACCCa^Cai^ 

AAAAATCTAAGTGTGAGGAGAACACAATGTTTCAACCTTATTGTTATAATAATGACAGTA 

AGAACAGCATGGCAGAATCGAAGGAAGCAAGAGACCAAGAAATGAACCTGAAAGAAGAAT 

CTAAAGAAGAAAAAAGAAGAAATGACTGGTGGAAAATAGGTATGTTTCTGTTATGCTTAG 

CAGGAACTACTGGAGGAATACTTTGGTGGTATGAAGGACTCCCACAGCAACATTATATAG 

GGTTGGTGGCGATAGGGGGAAGATTAT^CGGATCTGGCCAATCAAATGCTATAGAATGCT 

GGGGTTCCTTCCCGGGGTGTAGACCyiTTTCAAAATTACTTCAGTTATGAGACCAATAG 
GCATGCATATGGATAATAATACTGCTACATTATTAGAAGCTTTAACCAATATAACTGCTC

TATAAATAACAAAACAGAATTAGAAACATGGAAGTTAGTAAAGACTTCTGGCATAACTCC 

rrTACCTATTTCTTCTGAAGCTAACACTGGACTAATTAGACATAAGAGAGATTTTGGTAT 

AAGTGCAATAGTGGCAGCTATTGTAGCCGCTACTGCTATTGCTGCTAGCGCTACTATGTC 

TTATGTTGCTCTAACTGAGGTTAACAAAATAATGGAAGTACaU^TCATACTTTT^ 

AGAAAATAGTACTCTAAATGGTATGGATTTAATAGAACGAO^TAAAGATATTATATGC 

TATGATTCTTCAAACACATGCAGATGTTCAACTGTTAAAGGAAAGACAACAGGT^ 

GACATTTAATTTAATTGGATGTATAGAAAGAACACATGTATITrTGTCyiTACTGGTC^ 

CTGGAATATGTCATGGGGACATTTAAATGAGTCAACACTUVTGGGATGACTGGGTAAGC;^ 

AATGGAAGATTTAAATCSU^GAGATACTAACTACACTTCATGGAGCCAGGAACy^ 

ACAATCCATGATAACATTCAATACa^CCAGATAGTATAGCTC7\ATTTG<^^ 

GAGTCATATTGGAAATTGGATTCCTGGATTGGGAGCTTCCATTATAAAATATATAGTGAT 

GTTTTTGCTTATTTATTTGTTACTAACCTCTTCGCCTAAGATCCTCAGGGCCCTCTGGAA 

GGTGACCAGTGGTGCAGGGTCCTCCGGCAGTCGTTACCTGAAGAAAAAATTCCATCACy^ 

ACATGCATCGCGAGAAGACACCTGGGACCAGGCCCAACACAACATACACCTAGC^ 

GACCGOTGGATCaVGGGGACAAATACTACAAGCAGAAGTACTCCAGGAACGACTGGAATGG 

AGAATCAGAGGAGTACAACAGGCGGCCAAAGAGCTGGGTGAAGTCAATCGAGGCATTTGG 

AGAGAGCTATATTTCCGAGAAGACCAAAGGGGAGATTTCTCAGCCTGGGGCGGCTATCAA 

CGAGCACAAGAACGGCTCTGGGGGGAACAATCCTCACCAAGGGTCCTTAGACCTGGAGAT 

TCGAAGCGAAGGAGGAAAOVTTTATGACTGTTGCATTAAAGCCCAAGAAGGAACTCTCGC 

TATCCCnTOCTGTGGATTTCCCTTATGGCTATTTTGGGGACTAGTAATTATAGTAGGACG 

CATAGCAGGCTATGGATTACGTGGACTCGCTGTTATAATAAGGATTTGTATTAGAGGCTT 

A/^TTTGATATTTGAAATAATCy^GAAAAATGCTTGATTATATTGGAAGAGCTTTAAATCC 

TGGCACATCTCATGTATCAATGCCTCAGTAT6TTTAGAAAAACAAGGGGGGAACTGTGGG 

GTTTTTATGAGGGGTTTTATAAATGATTATAAGAGTAAAAAGAAAGTTGCTGATGCTCTC 

ATAACCTTGTATAACCCa^GGACTAGCTCATGTTGCTAGGCAACTAAAC 

GCATTTGTGACGCGAGTTCCCCATTGGTGACGCGTTAACTTCCTGTTTTTACAGT^ 

AGTGCTTGTATTCTGACAATTQGGCACTCAGATTCTGCGGTCTGAGTCCCTTCTCTGCTG 

GGCTGAAAAGGCCTTTGTAATAAATATAATTCTCTACTCAGTCCCTGTCTCTAGTTTGTC 

TGTTCGAGATCCTACAGAGCTCATGCCTTGGCGTAATCATGGTCATAGCTGTTTCCTGTG 

TGAAATTGTTATCCGCTCACAATTCCACACAACATACGAGCCGGAAGCaTAAAGTC 

GCCTGGGGTGCCTAATGAGTGAGCTAACTCACSlTTAATTGCGTTGCGCTCACTaCCCGCT 

TTCO^TCGGGAAACCTGTCGTGCGAGCTGCATTAATGAATCGGCCAACGTO 

GGCGGTTTGCGTATTGGGCGCTCTTCCGCTTCCTCGCTCACTGACTCGCTGCGCTCGGTC 

GTTCGGCTGCGGCGAGCGGTATCAGCTCACTCAAAGGCGGTAATACGGTTATCCACAGAA 

TCAGGGGATAACGCAGGAAAGAACATGTGAGCAAAAGGCCAGCA^ 

AAAAA6GCCGCGTTGCTGGCGTTTTTCCATAGGCTCCGCCCCCCTGACGAGCATCACAAA 

AATCGACGCTCAAGTCyWSAGGTGGCGAAACCCGAOUSGACTATAAAG^ 

CCCCCTGGAAGCTCCCTCGTGCGCTCTCCTGTTCCGACCCTGCCGCTTACCGGATACCTG 

TCCGCCTTTCTCCCTTCGGGAAGCGTGGCGCTTTCTCATAGCTCACGCTGTAGGTATCTC 

AGTTCGGTGTAGGTCGTTCGCTCCAAGCTGGGCTGTGTGCACGAACCCCCCGTTCAGCCC 

GACCGCTGCGCCTTATCCGGTAACTATCGTCTTGAGTCCAACCCGGTAAGACACGACTTA 

TCGCCACTGGCAGCAGCCACTGGTAACAGGATTAGCAGAGCGAGGTATGTAGGCG 

ACAGAGTTCTTGAAGTGGTGGCCTAACTACGGCTAOICTAGAAGGACAQTATTTGGTATC 

TGCGCTCTGCTGAAGCCAGTTACCTTCGGAAAAAGAGTTGGTAGCTCTTGATCCGGO^ 

CAAACCACCGCTGGTAGCGGTGGTTTTTTTGTTTGCAAGCAGCAGATTACGCGCAGAAAA 

AAAGGATCTCAAGAAGATCCTTTGATCTTTTCTACGGGGTCTGACGCTCAGTGGAACOAA 

AACTCACGTTAAGGGATTTTGGTCATGAGATTATCAAAAAGGATCTTCACCTAGATCCTT 

TTAAATTAAAAATGAAGTTTTAAATCAATCTAAAGTATATATGAGTAAACTTGGTCTGAC 

AGTTACC^ATGCTTAATCAGTGAGGCACCTATCTCaGCGATCTGTCTATTTCGTTCATCC 

ATAGTTGCCTGACTCCCCGTCGTGTAGATAACTAC6ATACGGGAGGGCTTACCATCTGGC 

CCCAGTGCTGCAATGATACCGCGAGACCCACGCTCACCGGCTCCAGATTTATCAGCAATA 

AACCAGCCAGCCGGAAGGGCCGAGCGCAGAAGTGGTCCTGCAACTTTATCCGCCTCCATC 

CAGTCTATTAATTGTTGCCGGGAAGCTAGAGTAAGTAGTTCGCCAGTTAATAGTTTGCGC 

AACGTTGTTGCCATTGCTACAGGCATCGTGGTGTCACGCTCGTCGTTTGGTATGGCTTCA 

TTCAGCTCCGGTTCCCAACGATCAAGGCGAGTTACATGATCCCCCATGTTGTGCAAAAAA 

GCGGTTAGCTCCTTCGGTCCTCCGATCGTTGTCAGAAGTAAGTTGGCCGCAGTGTTATCA 

CTCATGGTTATGGCAGCACTGCATAATTCTCTTACTGTCATGCCATCCGTAAGATGCTTT 

TCTGTGACTGGTGAGTACTCAACCAAGTCATTCTGAGAATAGTGTATGCGGCGACCGAGT 

TGCTCTTGCCCGGCGTCAATACGGGATAATACCGCGCCACATAGCAGAACTTTAT^AAGTG 

CrCATCATTGGAAAACGTTCTTCGGGGCGAAAACTCTCAAGGATCTTACCGCTGTTGAGA 



TCCAGTTCGATGTAACCCACTCGTGCACCCaU^CTGATCTTa^GCATCTTT^^ 

AGCGTTTCTGGGTGAGa^AAAA(^GGAAGGCAAAATGCCGC^^ 

ACACGGAAATGTTQAATACTCATACTCTTCCTTTTTCAATATTATTGAAGCATTT^ 

GGTTATTGTCTCATGAGCGGATACATATTTGAATGTATTTAGAAAAATAAACAAA^^ 

GTTCCGCGCACATTTCCCCGAAAAGTGCCAC 
SEQ ID No. 10 - pONY8.0Z

AGATCTTGAATT^TAAAATGTGTGTTTGTCCGAAATACGCGTTTTGAGATTTCTGTCGCC 

GACTAAATTCATGTCGCGCGATAGTGGTGTTTATCGCCGATAGAGATGGCGATATTGGAA 

AAATTGATATTTGAAAATATGGCATATTGAAAATGTCGCCGATGTGAGTTTCTGT6TAAC 

TGATATCGCCATTTTTCCAAAAGTQATTTTTGGGCATACGCGATATCTGGCGATAGCGCT 

TATATCGTTTACGGGGGATGGCGATAGACGACTTTGGTGACTTGGGCGATTCTGTGTGTC 

GC7VAATATCGCAGTTTCGATATAGGTGACAGACGATATGAGGCTATATCGCCGATAGAGG 

CGACy^TCAAGCTGGCACATGGCCAATGCATATCQATCTATACATTGAATCAATATTO^ 

ATTAGCCATATTATTCATTGGTTATATAGOVTAAATCAATATTGGCTATTGGC^ 

TACGTTGTATCCATATCGTTATATGTACATTTATATTGGCTCaVTGTCCAACyVTTA 

ATGTTGACATTGATTATTGACTAGTTATTAATAGTAATCAATTACGGGGTCArrAGTTC^ 

TAGCCCATATATGGAGTTCCGCGTTACATAACTTACGGTAAATGGCCCGCCTGGCT6ACC 

GCCCAACGACCCCCGCCCATTGACGTCAATAATQACGTATGTTCCCATAGTAACGCOUVT 

AGGGACTTTCCATTGACGTCAATGGGTCGAGTATTTACGGTAAACTGCCCAC^^ 

AaVTO^GTOTATO^TATGCCAAQTCCGCCCCCTATTGACGTCaATGACGGTAAA 

CGCCTGGCATTATGCCCAGTACATGACCTTACGGGACTTTCCTACTTGGCAGTACATCTA 

CGTATTAGTCTVTCGCTATTACCATGGTGATGCGGTTTTGGCAGTACACCAATGGGCGTGG 

ATAGCGGTTTGACTCACGGGGATTTCCAAGTCTCCACCCCATTGACGTaUlTGGaAQTTT 

GTTTTGGCACCAAAATCAACGGGACTTTCCAAAATGTCGTAAa^ 

CCGTTQACGCAAATGGGCGGTAGGCGTGTACGGTGGGAGGTCTATATAAGCAGAGCTCGT 
TTAGTGAACCGGGCACTCAGATTCTGCGGTCTGAGTCCCTTCTCTGCTGGGCTGAAAAGG 
CCTTTGTAATAAATATAATTCTCTACTCAGTCCCTGTCTCTAGTTTGTCTGTTCGAGATC
CTACAGTTGGCGCCCGAACAGGGACCTGAGAGGGGCGCAGACCCTACCTGTTGAACCTGG 
CTGATCGTAGGATCCCCGG6ACAGCAGAGGAGAACTTACAGAAGTCTTCTGQAGGTGTTC 
CTGGCCAGAACACAGGAGGAOVGOTAAGATTGGGAiGACCCTTTGAaiTTGC^^ 
CTCSVAGAAGTTAGAGAAQGTGACGGTACAMGGTCTCAGAAATT^ 
AATTGGGCGCTAAGTCTAGTAGACTTATTTCATGATACCAACTTTGTA 
TGGCT^GCTGAGGGATGTCTVTTCCATTGCTGGAAGATGTAACTCAGACGCTGTCaGGAC^ 
QAAAGAGAGGCCTTTGAAAGAACATGGTGGGCAATTTCTGCTGTAAAGATGGGCCTCCAG 
ATTAATAATGTAGTAGATGGAAAGGCATCATTCO^CTCCTAAGAGCOAAATATC 
AAGACTGCTAATAAAAAGOlGTePGAGCCCTCTGAAGAATATCTCTAGAACTAGTGGATC 
CCCCGGQCTGCAGGAGTGQGGAGGCACGATG6CCGCTTTGGTCGAGGCGGATCCGGCCAT 
TAGCa^TATTATTOVTTGGTTATATAGCATAAATCAATATTGGCTATTGGCCATTGCATA 
CGTTGTATCCATATCATAATATGTACATTTATATTGGCTCATGTCCAACATTACCGCCAT 
GTTGACATTGATTATTGACTAGTTATTAATA6TAATCAATTACGGGGTCATTAGTTCATA 
GCCCyVTATATGGAGTTCCGCGTTACATAACTTACGGTAAATGGCCCGCCTGGCTGACCGC 
CCAACGACCCCCGCCCATTGACGTCAATAATGACGTATGTTCCCATAGTAACGCCAATAG 
GGACTTTCCATTGACGTOU^TGQGTGGAGTATTTACGGTAAACTGCCCACTTGGCAGTAC 
ATCAAGTGTATCATATGCCAAGTACGCCCCCTATTGACGTCAATGACGGTAAATGGCCCG 
CCTGGCATTATGCCCAGTACATGACCTTATGGGACTTTCCTACTTGGCAGTACATCTACG 
TATTAGTCATCGCTATTACCATGGTGATGCGGTTTTGGCAGTACATCAATGGGCGTGGAT 
AGCGGTTTGACT(^CGGGGATTTCCAAGTCTCCACCCCATTGACGTCAATGGGAGTTTC 
TTTGGCACCAAAATCAACGGGACTTTCO^AAATGTCGTAACAACTCCGCC 
AAATGGGCGGTAGGCATGTACGGTGGGAGGTCTATATAAGCAGAGCTCGTTTAGTGAACC 
GTCAGATCGCCTGGAGACGCCATCCACGCTGTTTTGACCTCCATAGAAGACACCGGGACC 
GATCCAGCCTCCGCGGCCCCAAGCTTCAGCTGCTCGAGGATCTGCGGATCCGGGGAATTC 
CCCAGTCTCTVGGATCCACCy^TGGGGGATCCCGTCGlTTTACT^CGTCGTGACTG^ 
CCCTGGCGTTACCCAACTTAATCGCCTTGCAGCACATCCCCCTTTCGCCaLGCT 
TAGCGAAGAGGCCCGCACCGATCGCCCTTCCCAACJVGTTGCGCy^GCCTGAATGGCGAATG 
GCGCTTTGCCTGGTTTCCGGCACCAGAAGCGGTGCCGGAAAGCTGGCTGGAGTGCGATCT 
TCCTGAGGCCGATACTGTCGTCGTCCCCTCAAACTGGCAGATGCACGGTTACGATGCGCC 
CATCTACACCAACGTAACCTATCCCATTACGGTCAATCCGCCGTraGTTCCCACGGAG^ 
TCCGACGGGTTGTTACTCGCTCACATXTAATGTTGATGAAAGCTGGCTACAGGAAGGCCA 
GACGCGAATTATTTTTGATGGCGTTAACTCGGCGTTTCATCTGTGGTGCT^^ 
GGTCGGTTACGGCCAGGACAGTCGTTTGCCGTCTGAATTTGACCTGAGCGCATTTTTACG 
CGCCGGAGAAAACCGCCTCGCGGTGATGGTGCTGCGTTGGAGTGACGGCAGTTATCTGGA 
AGATCAGGATATGTGGCGGATGAGCGGCATTTTCCGTGACGTCTCGTTGCTGCATAAACC 
GACTACACAAATCAGCGATTTCCATGTTGCCACTCGCTTTAATGATGATTTCA 
TGTACTGGAGGCTGAAGTTCAGATGTGCGGCQAGTTGCGTGACTACCTACGGGTAACAGT 
TTCTTTATGGCAGGGTGAAACGCAGGTCGCCAGCGGCyVCCGCGCCTTTCGGCGG 
TATCGATGAGCGTGGTGGTTATGCCGATCGCGTCACACTACGTCTGAACGTCGAAAACCC

GAAACTGTGGAGCGCCGAAATCCCGAATCTCTATCGTGCGGTGGTTGAACTGCACACCGC 

CGACGGCACGCTGATTGAAGCAGAAGCCTGCGATGTCGGTTTCCGCGAGGTGCGGATTGA 

AAATGGTCTGCTGCTGCTGAACGGCAAGCCGTTGCTGATTCGAGGCGTTAACCGTCACGA 

GCATCATCCTCTGCATGGTCAGGTCATGGATGAGCAGACGATGGTGCAGGATATCCTGCT 

QATGAAGCaiGAACSUlCTTTAACGCCGTGCGCTGTTCGCSVTTATCCGAACC^^ 

GTACACGCTGTGCGACCGCTACGGCCTGTATGTGGTGGATGAAGCCAATATTGAAACCCA 

CGGCATGGTGCCAATGAATCGTCTGACC6ATGATCCGCGCTGGCTACCGGCGATGAGCGA 

ACGCGTAACGCGAATGGTGCAGCGCGATCGTAATCACCCGAGTGTGATCATCTGGTCGCT 

GGGGAATGAATCAGGCCACGGCGCTAATCACGACGCGCTGTATCGCT6GATCAAATCTGT 

CGATCCTTCCCGCCOSGTGCAGTATGAAGGCGGCGGAGCCGACACCACGGCCACCG^ 

TATTTGCCCGATGTACGCGCGCGTGGATGAAGACCAGCCCTTCCCGGCTGTGCCGAAATG 

GTCCATCAAAAAATGGCTTTCGCTACCTGGAGAGACGCGCCCGCTGATCCTTTGCGAATA 

CGCCCACGCGATGGGTAACAGTCTTGGCGGTTTCGCTAAATACTG6CAGGCGTTTCGTCA 

GTATCCCCGTTTACAGGGCGGCTTCGTCTGGGACTGGGTGGATCAGTCGCTGATTAAATA 

TGATGAAAACGGCaUlCCCGTGGTCGGCm^ACGGCGGTGATTTTGGCGATACGCCGA^^ 

TCGCCAGTTCTGTATGAACGGTCTGGTCTTTGCCGACCGCaiCGCCGa\TCCAGC 

GGAAGCAAAACyiCCAGCAGCyVGTTTTTCCAGTTCCGTT^^ 

GACCAGCGAATACCTGTTCCGTCATAGCGATAACGAGCTCCTGCACTGGATGGTGGCGCT 

GGATGGTAAGCCGCTGGCAAGCGGTGAAGTGCCTCTGGATGTCGCTCCACAAGGTAAACA 

GTTGATTGAACTGCCTGAACTACCGOVQCCGGAGAGCGCCGQGCa^CTCTGGCT 

ACGCGTAGTGCy^CCGAACGCGACCGCATGGTCAQAAGCCGGGCACATCy^ 

GCAQTGGCGTCTGGCGGAAAACCTCAGTGTGACGCTCCCCGCCGCGTCCCACGCCATCCC 

GCATCTGACCyVCCAGCGAAATGGATTTTTGCATCGAGCTGGGTAATAAGCGTTGGCAAT^ 

TAACCGCCAGTCAGGCTTTCTTTCACAGATGTGGATTGGCGATAAAAAACAACTGCTGAC 

GCCGCTGCGCGATCAGTTCACCCGTGCACCGCTGGATAACGACATTGGCGTAAGTGAAGC 

GACCCGCATTGACCCTAACGCCTGGGTC6AACGCTGGAAGGCGGCGG6CCATTACCAGGC 

CGAAGCT^GCGTTGTTGOIGTGCACGGCAGATACACTTGCTGATGCGGTGCTGAOT 

CGCTCyVCGCGTGGCAGOVTCAGGGGAAAACCTTATTTATCAGCCGGAAAACCTACCGGAT 

TGATGGTAGTGGTCAAATGGCGATTACCGTTGATGTTGAAGTGGCGAGCGATACACCGCA 

TCCGGCGCGGATTGGCCTGAACTGCCAGCTGGCGCAGGTAGCAGAGCGGGTAAACTGGCT 

CGGATTAGGGCCGOUVGAAAACTATCCCGACCGCCTTACTGCCGCCTGTTTTGACCGCTG 

GQATCTGCCATTGTCAQACaiTGTATACCCCGTACGTCTTCCCGAGCGAAAACGGTCTGCG 

CTGCGGGACGCGCGAATTGAATTATGGCCCACACCAGTGGCGCGGCGACTTCCy^GTTCA^ 

a^TCAGCCQCTACAGTaWlCAGCa^CTGATGGAAACCAGC 

GGAAGAAGGCACATGGCTGAATATCGACGGTTTCCATATGGGGATTGGTGGCGACGACTC 

CTGGAGCCCGTCAGTATCGGCGGAATTCCAGCTGAGCGCCGGTCGCTACCATTACCAGTT 

GGTCTGGTGTCAAAAATAATT^TAACCGGGCAGGGGGGATCCGCAGATCCGGCTGTGGAA 

TGTGTGTO^GTTAGGGTGTGGAAAGTCCCCTVGGCTCCCCAGCy^GGa^GAAGTATGCa^ 

CATGCCTGCAGGAATTC6ATATCAAGCTTATCQATACCGTCGACCTCGAQGGGQGGCCCG 

GTACCCAGCTTTTGTTCCCTTTAGTGAGGGTTAATTGCGCGGGAAGTATTTATCACTAAT 

a\AGCACAAGTAATACATGAGAAACTTTTACTACAGCy^GCACAAT^ 

TGTTTTTACAAAATCCCTGGTGAACATGATTGGAAGGGACCTACTAGGGTGCTGTGGAAG 

GGTGATGGTGCAGTAGTAGTTAATGATGAAGGAAAGGGAATAATTGCTGTACCATTAACC 

AGGACTAAGTTACTAATAAAACCa^TTGAGTATTGTTGCAGGAAGCAAGACCCAACTAC 

CaiTTGTCatGCTGTGTTTCCTGACCTCAATATTTGTTATAAGGTTTGATATGAATCCC^ 

GGGAATCTCAACCCCrATTACCC?^(::AGTCAGAAAAATCTAAGTGTGAGGAGAAC^ 

GTTTCAACCTTATTGTTATAATAATGACAGTAAGAACAGCATGGCAGAATCGAAGGAAGC 

AAGAGACCAAGAATGAACCTGAAAGAAGAATCTAAAGAAGAAAAAAGAAGAAATGACTGG 

TGGAAAATAGGTATGTTTCTGTTATGCTTAGCAGGAACTACT6GAGGAATACTTTGGTGG 

TATGAAGGACTCCCACAGCTVACATTATATAGGGTTGGTGGCGATAGGGGGAAGATTJ^ 

GGATCTGGCCAATCAAATGCTATAGAATGCTGGGGTTCCTTCCCGGGGTGTAGACCATTT 

CAAAATTACTTCAGTTATGAGACCAATAGAAGCATGCATATGGATAATAATACTGCTACA 

TTATTAGAAGCTTTAACCAATATAACTGCTCTATAAATAACAAAACAGAATTAGAAACAT 

GGAAGTTAGTAAAGACTTCTGGCATAACTCCTTTACCTATTTCTTCTGAAGCTAAC^ 

GACTAATTAGACATAAGAGAGATTTTGGTATAAGTGCAATAGTGGCAGCTATTGTAGCCG 

CTACTGCTATTGCTGCTAGCGCTACTATGTCTTATGTTGCTCTAACTGAGGTTAACAAAA 

TAATGGAAGTACAAAATCATACTTTTGAGGTAGAAAATAGTACTCTAAATGGTATGGATT 

TAATAGAACGACAAATAAAGATATTATATGCTATGATTCTTCTU^CACATGCAGATGr^ 

AACTGTTAAAGGAAAGACa^CTVGGTAGAGGAGACATTTAATTTAATTGGATGT^ 

GAACACATGTATTTTGTCATACOXSGTCATCCCTGGAATATGTCATGGGGACTITTTAT^ 
AGTCAACAO^TGGGATGACTGGGTAAGCAAAATGGAAGATTTAAATCAAGAGATACT

CTACACTTCATGGAGCCAGGAACAATTTGGCACAATCCATGATAACATTCAATACACCAG 

ATAGTATAGCTCAATTTGGAAAAGACCTTTGGAGTCATATTGGAAATTGGATTCCTGGAT 

TGGGAGCTTCCATTATAAAATATATAGTGATGTTTTTGCTTATTTATTTGTTACTAACCT

CTTCGCCTAAGATCCTCAGGGCCCTCTGGAAGGTGACCAGTGGTGCAGGGTCCTCCGGCA

GTCGTTACCTGAAGAAAAAATTCCATCACAAACATGCATCGCGAGAAGACACCTGGGACC 

AGGCCCAACACAACATACACCTAGCAGGCGTGACCGGTGGATCAGGGGACAAATACTACA 

AGCAGAAGTACTCCAGGAACGACTGGAATGGAGAATCAGAGGAGTACAACAGGCGGCCAA 

AGAGCTGGGTGAAGTCAATCGAGGCATTTGGAGAGAGCTATATTTCCGAGAAGACC?UU^ 

GGGAGATTTCTCAGCCTGGGGCGGCTATCAACGAGCACAAGAACGGCTCTGGGGGGAACA 

ATCCTOVCCAAGGGTCCTTAGACCTGGAGATTCGAAGCGAAGGAGGAAACATTTATGACT 

GTTGCATTAAAGCCCAAGAAGGAACTCTCGCTATCCCTTGCTGTGGATTTCCCTTATGGC

TATTTTGGGGACTAGTAATTATAGTAGGACGCATAGCAGGCTATGGATTACGTGGACTCG 

CnXSTTATAATAAGGATTTGTATTAGAGGCTTAAATTTGATATTTGAAATAATCa^GA?^^ 

TGCTTGATTATATTGGAAGAGCTTTAAATCCTGGCACATCTCATGTATCAATGCCTt^ 

ATGTCTAGAAAAACAAGGGGGGAACTGTGGGGTTTTTATGAGGGGTTTTATAAATGATTA 

TAAGAGTAAAAAGAAAGTTGCTGATGCTCTCATAACCTTGTATAACCCAAAGGACTAGCT 

aVTGTTGCTAGGCAACTAAACCGC?^TAACCGCATTTGTGACGCGAGTTCCCCATTGGTG 

ACGCGTTAACOTCCTGTTTTTAOUSTATATAAGTGCTTGTATTCTGACAATTGGGCACTC 

AGATTCTGCGGTCTGAGTCCCTTCTCTGCTGGGCTGAAAAGGCCTTTGTAATAAATATAA 

TTCTCTACTCAGTCCCTGTCTCTAGTTTGTCTGTTCGAGATCCTAaVGAGCTC^ 

GGCGTAATCATGGTOVTAGCTGIOTCCTGTGTGAAATTGTTATCCGCTCACAATTCCAC^ 

CAACATACGAGCCGGAAGCATAAAGTGTAAAGCCTGGGGTGCCTAATGAGTGAGCTAACT 

CACATTAATTGCGTTGCGCTCACTGCCCGCTTTCCAGTCGGGAAACCTGTCGTGCCAGCT 

GCATTAATGAATCGGCCAACGCGCGGGGAGAGGCGGTTTGCGTATTGGGCGCTCTTCCGC 

TTCCTCGCTCACTGACTCGCTGCGCTCGGTCGTTCGGCTGCGGCGAGCGGTATCAGCTCA

CTOVAAGGCGGTAATACGGTTATCaiCAGAATCAGGGGATAACGCAGGAAAGAACATG^^ 

AGCAAAAGGCCaiGCaUU^GGCCyiGGAACCGTAAAAAGGCCGCGTTGCTGGCGTTO 

TAGGCTCCGCCCCCCTGACGAGCATCACAAAAATCGACGCTCAAGTCAGAGGTGGCGAAA

CCCGACAGGACTATAAAGATACCAGGCGTTTCCCCCTGGAAGCTCCCTCGTGCGCTCTCC 

TGTTCCGACCCTGCCGCTTACCGGATACCTGTCCGCCTTTCTCCCTTCGGGAAGCGTGGC

GCTTTCT.CATAGCTCACGCTGTAGGTATCTCSIGTTCGGTGTAGGTCGTTCGCTCCAAGC^ 

GGGCTGTGTGa^COAACCCCCTCTTCAGCCCGACCGCTQaSCCTTATCaSGTAA 

TCTTGAGTCCaU^CCCGGTAAGACACGACTTATCGCCACTGGCAGCAGCCACTGGTA^ 

GATTAGCAGAGCGAGGTATGTAGGCGGTGCTACAGAGTTCTTGAAGTGGTGGCCTAACTA

CGGCTACACTAGAAGGACAGTATTTGGTATCTGCGCTCTGCTGAAGCCAGTTACCTTCGG 

AAAAAGAGTTGGTAGCTCTTGATCCGGCAAACAAACCACCGCTGGTAGCGGTGGTTTTTT 

TGTTTGCAAGCAGCAGATTACGOXIAGAAAAAAAGGATCTCAAGAAGATCCTT^ 

TTCTACGGGGTCTGACGCTCAGTGGAAOSAAAACTCACGTTAAGGGATTTTGGTCATGAG 

ATTATCAAAAAGGATCTTCACCTAGATCCTTTTAAATTAAAAATGAAGTTTTAAATC^ 

CTAAAGTATATATGAGTAAACTTGGTCTGACAGTTACCAATGCTTAATCAGTGAGGCACC 

TATCTCAGCGATCTGTCTATTTCGTTCATCCATAGTTGCCTGACTCCCCGTCGTGTAGAT 

AACTACGATACGGGAGGGCTTACCATCTGGCCCCAGTGCTGCAATGATACCGCGAGACCC 

ACGCTCACCGGCTCCAGATTTATCAGCAATAAACCAGCCAaCCGaAAGGGCCmGCGCA^ 

AAGTGGTCCTGO^CTTTATCCGCCTCCATCCyVGTCTATTAATTGTTGCCGGGAAGCTAG 

AGTAAGTAGTTCGCCAGTTAATAGTTTGCGCAACGTTGTTGCCATTGCTACAGGCATCGT 

GGTGTCACGCTCGTCGTTTGGTATGGCTTCATTCAGCTCCGGTTCCCAACGATCAAGGCG 

AGTTACATGATCCCCCATGTTGTGCAAAAAAGCGGTTAGCTCCTTCGGTCCTCCGATCGT 

TGTCAGAAGTAAGTTGGCCGCAGTGrrATCACTOlTGGTTATGGCAGCACTGCATAATTC 

TCTTACTGTCATGCCATCCGTAAGATGCTTTTCTQTGACTGGTGAGTACTCAACC^ 

ATTCTGAGAATAGTGTATGCGGCGACCGAGTTGCTCTTGCCCGGCGTCAATACGGGATAA 

TACCGCGCCACATAGCAGAACTTTAAAAGTGCTCATCATTGGAAAACGTTCTTCGGGGCG 

AAAACTCTCAAGGATCTTACCGCTGTTGAGATCCAGTTCGATGTAACCCACTCGTGCACC

CAACTGATCTTCAGCATCTXrrACTTTCACCAGCGTTTCTGGGTGAGCAAAAACAGG^ 

GCAAAATGCCGOVAAAAAGGGAATAAGGGCGACACGQAAATGTTGAATACrCAT^^ 

CCTTTTTCAATATTATTGAAGCATTTATCAGGGTTATTGTCTCATGAGCGGATACATATT

TGAATGTATTTAGAAAAATAAACAAATAGGGGTTCCGCGCACATTTCCCCGAAAAGTGCC 

ACCTAAATTGTAAGCGTTAATATTTTGTTAAAATTCGCGTTAAATTTTTGTTAAATCAGC 

TCyVTTTTrrAACCAATAGGCCGAAATCGGCAAAATCCCTTATAAATCAAAAGAATAGACC 

GAGATAGGGTTQAGTGTTGTTCCAGTTTGGAAOlAGAGTCCyVCTATTAATVGAACGT^ 

TCCAACGTOUU^GGGCGAAAAACCGTCTATCAGGQCGATGGCCCACTACGTGAACCAT^^ 
SEQ ID No. 11 - pONY8.1Z

AGATCTTGAATAATAAAATGTGTGTTTGTCCGAAATACGCGTTTTGAGATTTCTGTCGCC 

GACTAAATTCATGTCGCGCGATAGTGGTGTTTATCGCCGATAGAGATGGCGATATTGGAA 

AAATTGATATTTGAAAATATGGCaiTATTGAAAATGTCGCCGATGTGAGTTTC^ 

TGATATCGCCATTTTTCCAAAAGTGATTTTTGGGCATACGCGATATCTGGCG^ 

TATATCGTTTACGGGGGATGGCGATAGACGACTTTGGTGACTTGGGCGATTCTGTGTGTC

GCAAATATCGCAGTTTCGATATAGGTGACAGACGATATGAGGCTATATCGCCGATAGAGG 

aSACATOAGCTGGCACATGGCCyUVTGaVTATCGATCTATACATTG 

ATTAGCaVTATTATTCATTGGTTATATAGCATAAATaUVTATTGGCTATTC 

TACGTTCTATCCa^TATCGTAATATGTACy^TTTATATTGGCTaVTGTCCyi^ 

ATGTTGACATTGATTATTGACTAGTTATTAATAGTAATCAATTACGGGGTCATTAGTTCA 

TAGCCCATATATGGAGTTCCGCGTTACATAACTTACGGTAAATGGCCCGCCTGGCTGACC

GCCCAACGACCCCCGCCCATTGACGTCAATAATGACGTATGTTCCCATAGTAACGCCAAT 

AGGGACTTTCCATTGACGTCyUVTGGGTGGAGTATTTAC^ 

ACATCAAGTGTATaATATGCa\AGTCCGCCCCCTATTGAa3TCAATGACGGTA 

CGCCTGGCATTATGCCCAGTACATGACCTTACGGGACTTTCCTACTTGGCAGTACATCTA 

CGTATTAGTCATCGCTATTACCATGGTGATGCGGTTTTGGCAGTACACCAATGGGCGTGG 

ATAGCGGXTTGACTGACGGGGATTTCCAAGTCTCCACCCGATTOACGTCaAT^ 

GTTTTGGCACCAAAATCaVACGGGACTTTCa^AAATQTCGTAAC^ 

CCGTTGACGCAAATGGGCGGTAGGCGTGTACGGTGGGAGGTCTATATAAGCAGAGCTCGT

TTAGTGAACCGGGCACTCAGATTCTGCGGTCTGAGTCCCTTCTCTGCTGGGCTGAAAAGG 

CCTTTGTAATAAATATAATTCTCTACTCAGTCCCTGTCTCTAGTTTGTCTGTTCGAGATC 

CTACAGTTGGCGCCCGAACAGGGACCTGAGAGGGGCGCAGACCCTACCTGTTGAACCTGG 

CTGATCGTAGGATCCCCGGGACAGCAGAGGAGAACTTACAGAAGTCTTCTGGAGGTGTTC 

CTGGCCAGAA<:ava^GGAGGAa\.GGTAAGArrGGGAGACCCTTTGACATT^ 

CTCAAGAAGTTAGAGAAGGTGACGGTACAAGGGTCTCAGAAATTAACTACTGGTAACTGT

AATTGGGCGCTAAGTCTAGTAGACTTATTTCATGATACCAACTTTGTAAAAGAAAAGGAC 

TGGCAGCTGAGGGATGTCATTCCATTGCTGGAAGATGTAACTCAGACGCTGTCAGGACAA 

GA^^GAGAGGCCTTTGAAAGAACATGGTGGGCAATTTCTGCTGTAAAGaLTGGGCCTCC^ 

ATTAATAATGTAGTAGATGGAAAGGCATa^TTCCAGCTCCTAAGAGCGA^ 

AAGACTGCTAATAAAAAGCaGTCTGAGCCCTCTQAAGAATATCTCTAGAACTAGTGGATC 

CCCCGGGCTGCAGGAGTGGGGAGGCACGATGGCCGCTTTGGTCGAGGCGGATCCGGCCAT

TAGCCATATTATTCATTGGTTATATAGCATAAATCAATATTGGCTATTGGCCATTGCATA 

CGTTGTATCCy^TATCATAATATGTACATTTATATTGGCTCATGTCOWVCATTACCGCC^ 

GTTGACATTGATTATTGACTAGTTATTAATAGTAATCAATTACGGGGTCATTAGTTCATA 

GCCCATATATGGAGTTCCGCGTTACATAACTTACGGTAAATGGCCCGCCTGGCTGACCGC 

CCAACGACCCCCGCCCATTGACGTCAATAATGACGTATGTTCCCATAGTAACGCCAATAG

GGACTTTCCATTGACGTCAATGGGTGGAGTATTTACGGTAAACTGCCCACTTGGCAGTAC 

ATCAAGTGTATCATATGCCAAGTACGCCCCCTATTGACGTCAATGACGGTAAATGGCCCG

CCTGGCATTATGCCCAGTACATGACCTTATGGGACTTTCCTACTTGGCAGTACATCTACG 

TATTAGTCATCGCTATTACCATGGTGATGCGGTTTTGGCAGTACATCAATGGGCGTGGAT 

AGCGGTTTGACTCACGGGGATTTCCau^GTCTCCACCCCATTGAGGTCAATGGGAGTTTGT 

TTTGGCACCAAAATCAACGGGACTTTCCAAAATGTCGTAACAACTCCGCCCCATTGACGC

AAATGGGCGGTAGGCATGTACGGTGGGAGGTCTATATAAGCAGAGCTCGTTTAGTGAACC 

GTCAGATCGCCTGGAGACGCCATCCACGCTGTTTTGACCTCCATAGAAGACACCGGGACC 

GATCCAGCCTCCGCGGCCCCAAGCTTCAGCTGCTCGAGGATCTGCGGATCCGGGGAATTC

CCCAGTCTCAGGATCCACCATGGGGGATCCCGTCGTTTTACAACGTCGTGACTGGGAAAA 

CCCTGGCGTTACCCAACTTAATCGCCTTGCAGCaVCaiTCGCCCTTTCGCCAG 

TAGCGAAGAGGCCCGCACCGATCGCCCTTCCa^CyVGTTGCGCAGCCTGTVATGGCGJ^ 

GCGCTTTGCCTGGTTTCCGGCACCAGAAGCGGTGCCGGAAAGCTGGCTGGAGTGCGATCT 

TCCTGAGGCCGATACTGTCGTCGTCCCCTCAAACTGGCAGATGCACGGTTACGATGCGCC 

CATCTACACCAACGTAACCTATCCCATTACGGTCAATCCGCCGTTTGTTCCCACGGAGAA

TCCGACGGGTTGTTACTCGCTCACATTTAATGTTGATGAAAGCTGGCTACAGGAAGGCCA 

GACXaCGAATTATTTTTGATGGCGTTAACTCGGCGTTTCATCTGTGGTGCAACGGGCGCTG 

GGTCGGTTACGGCCAGGACAGTCGTTTGCCGTCTGAATTTGACCTGAGCGCATTTTTACG 

CGCCGGAGAAAACCGCCTCGCGGTGATGGTGCTGCGTTGGAGTGACGGCAGTTATCTGGA

AGATCAGGATATGTGGCGGATGAGCGGCATTTTCCGTGACGTCTCGTTGCTGCATAAACC 

GACTACACAAATCAGCGATTTCCATGTTGCCACTCGCTTTAATGATGATTTCAGCCGCGC 

TGTACTGGAGGCTGAAGTTCAGATGTGCGGCGAGTTGCGTGACTACCTACGGGTAACAGT

TTCTTTATGGCAGGGTGAAACGCa^GGTCGCCTVGCGGCACCGCGCCTTTCGGCGGTGA^ 
TATCGATGAGCGTGGTGGTTATGCCGATCGCGTCACACTACGTCTGAACGTCGAAAACCC

GAAACTGTGGAGCGCCGAAATCCCGAATCTCTATCGTGCGGTGGTTGAACTGCACACCGC 

CGACGGCACGCTGATTGAAGCAGAAGCCTGCGATGTCGGTTTCCGCGATO 

AAATGGTCTGCTGCTGCTGAACGGCAAGCCGTTGCTGATTCGAGGCGTTAACCGTCACGA 

GCATCATCCTCTGCATGGTCAGGTCATGGATGAGCAGACGATGGTGCAGGATATCCTGCT 

GATGAAGCAGAACAACTTTAACGCCGTGCGCTGTTCGCATTATCCGAACCATCCGCTGTG 

GTACACGCTGTGCGACCGCTACGGCCTGTATGTGGTGGATGAAGCCAATATTGAAACCCA 

CGGCATGGTGCCAATGaATCGTCTGACCGATGATCCGCGCTGGCTACC^ 

ACGOGTAACGCGAATGGTGCAGCGCGATCGTAATCACCCGAGTGTGATCATCTGGTCGCT 
GGGGAATGAATCAGGCCACGGCGCTAATCACGACGCGCTGTATCGCTGGATCAAATCTGT 
CGATCCTTCCCGCCCGGTGOVGTATGAAGGCGGCGGAGCCGACACCACGGCCACCGATAT 
TATTTGCCCGATGTACGCGCGCGTGGATGAAGACCAGCCCTTCCCGGCTGTGCCGAAATG 
GTCCATCAAAAAATGGCTTTCGCTACCTGGAGAGACGCGCCCGCTGATCCTTTGCGAATA
CGCCCACGCGATGGGTAACAGTCTTGGCGGTTTCGCTAAATACTGGCAGGCGTTTCGTCA
GTATCCCCGTTTACAGGGCGGCTTCGTCTGGGACTGGGTGGATCAGTCGCTGATTAAATA 
TGATGAAAACGGCAACCCGTGGTCGGCTTACGGCGGTGATTTTGGCGATACGCCGAACGA 
TCGCCAGTTCTGTATGAACGGTCTGGTCTTTGCCGACCGCyVCGCCGaVTCCAGCGCTGAC 
GGAAGaVAAACavca^GCAGO^GTTTTTCCaVGTTCC^ 

GACCAGCGAATACCTGTTCCGTCATAGCGATAACGAGCTCCTGCACTGGATGGTGGCGCT
GGATGGTAAGCCGCTG6CAAGCGGTGAAGTGCCTCTGGATGTCGCTCC7VCAAGGTAAACA 
GTTGATTGAACTGCCTGAACTACCGCAGCCGGAGAGCGCCGGGCAACTCTGGCTCACAGT
ACGCGTAGTGCAACCGAACGCGACCGCATGGTCAQAAGCCGGGCAa^TCAGCGCCTGGCA 
GCAGTGGCGTCTGGCGGAAAACCTCAGTGTGACGCTCCCCGCCGCGTCCCACGCCATCCC 
GCATCTGACCACCAGCOAAATGGATTTTTGCATCGAG 

TAACCGCCAGTCAGGCTTTCTTTCAaVGATGTGGATTGGCGATAAAAAAO^ 

GCCGCTGCGCQATCAGTTGACCCGTGCACCGCTGGATAACGACaVTTGGCGTAAGTGAAGC 

GACCCGCATTGACCCTAACGCCTGGGTCGAACGCTGGAAGGCGGCGGGCCATTACCAGGC 

CGAAGCAGCGTTGTTGCAGTGCACGGCMATAaVCTTGCTGATGCGGTGCTGATTAC^^ 

CGCTCACXSCGTGGCAGCATCAGGGGAAAACCTTATTTATCAGCCGGAAAACCT^^ 

TGATGGTAGTGGTCAAATGGCQATTACCGTTGATGTTGAAGTGGCGAGCGAT^ 

TCCGGCGCGGATTGGCCTGAACTGCCAGCTGGCGCAGGTAGCAGAGCGGGTAAACTGGCT 

CGGATTAGGGCCGCAAGAAAACTATCCCGACCGCCTTACTGCCGCCTGTTTTGACCGCTG

GGATCTGCCATTGTCAGACATGTATACCCCGTACGTCTTCCCGAGCGAAAACGGTCTGCG

CTGCGGGACGCGCGAATTGAATTATGGCCCyVCACCAGTGGCGCGGCGACTTCCAGTTC^ 

CATCAGCCGCTACAGTCAACAGCAACTGATGGAAACCAGCCa^TCGCCATCTGC^ 

GGAAGAAGGCACATGGCTGAATATOSACGGTTTCCATATGGGGATTGGTGGCGACGACTC 

CTGGA6CCCGTCAGTATC?GGCGQAATTCCAGCTGAGCGCCGGTCGCTACCATTACCAGTT 

GGTCTGGTGTCAAAAATAATAATAACCGGGCAGGGGGGATCCGCAGATCCGGCTGTGGAA

TOTGTGTCAGTTAGGGTGTGGAAAGTCCCCAGGCTCCCCAGCAGGCAGAAGTA^^ 

CATGCCTGCAGGAATTCGATATCAAGCTTATCGATACCX3TCGAATTGGAAGAGCTTTAAA 

TCCTGGCACATCTCATGTATCAATGCCTCAGTATGTTTAGAAAAACAAGGGGGGAACTO^ 

GGGGTTTTTATGAGGGGTTTTATAAATGATTATAAGAGTAAAAAGAAAGTTGC^^ 

CTCATAACCTTGTATAACCO^GGACTAGCTCATGTTGCTAGGCAACTAAACCGCAATA 

ACCGCATTTGTGACGCGAGTTCCCCATTGGTGACGCGTTAACTTCCTGTTTTTACAGTAT 

ATAAGTGCTTGTATTCTGACAATTGGGCACTCAGATTCTGCGGTCTGAGTCCCTTCTCTG

CTGGGCTGAAAAGGCCTTTGTAATAAATATAATTCTCTACTCAGTCCCTGTCTCTAGTTT 

GTCTGTTCGAGATCCTACyiGAGCTCa.TGCCTTGGCGTAATa^TGGTCATAGCTGTTTCCT 

GTGTGAAATTGTTATCCGCTCACAATTCCACACAACATACGAGCCGGAAGCATAAAGTGT 

AAAGCCTGGGGTGCCTAATGAGTGAGCTAACTCACATTAATTGCGTTGCGCTCACTGCCC 

GCTTTCCAGTCGGGAAACCTGTCGTGCCAGCTGCATTAATGAATCGGCCAACGCGCGGGG 

AGAGGCGGTTTGCGTATTGGGCGCTCTTCCGCTTCCTCGCTCACTGACTCGCTGCGCTCG 

GTCGTTCGGCTGCGGCGAGCGGTATCAGCTCACTCAAAGGCGGTAATACGGTTATCCACA

GAATCAGGGGATAACGO^GGAAAGAACATGTGAGCAAAAGGCCAGCAAAAGGCC^^ 

CGTAAAAAGGCCGCGTTGCTGGCGTTTTTCCATAGGCTCCGCCCCCCTGACGAGCATCAC 

AAAAATCGACGCTCAAGTCAGAGGTGGCGAAACCCGACAGGACTATAAAGATACCAGGCG

TTTCCCCCTGGAAGCTCCCTCGTGCGCTCTCCTGTTCCGACCCTGCCGCTTACCGGATAC 

CTGTCCGCCTTTCTCCCTOCGGGAAGCGTGGCGCTTTCTCy^TAGCTavCGCTGTAGGTAT 

CTCAGTTCGGTGTAGGTCGTTCGCTCCAAGCTGGGCTGTGTGCACGAACCCCCCGTTCAG

CCCGACCGCTGCGCCTTATCCGGTAACTATCGTCTTGAGTCCAACCCGGTAAGACACGAC 

TTATCGCCACTGGCAGCAGCCACTGGTAACAGGATTAGCAGAGCGAGGTATGTAGGCGGT

GCTACAGAGTTCTTGAAGTGGTGGCCTAACTACGGCTACACTAGAAGGACAGTATTTGGT 
ATCTGCGCTCTGCTGAAGCCAGTTACCTTCGGAAAAAGAGTTGGTAGCTCTTGATCCGGC

AAAAT^GGATCTCAAGAAGATCCTTTGATCTTTTCTACGGGGTCTGACGCTCAGTGGAAC 

GAftAACTCACGTTAAGGGATTTTGGTCATGAGATTATCAAAAAGGATCTTC^ 

CTTTTAAATTAAAAATGAAGTTTTAAATOVATCTAAAGTATATATGAGTAAACTTGGTCT 

GACAGTTACCAATGCTTAATCAGTGAGGCACCTATCTCAGCGATCTGTCTATTTCGra 

TCCATAGTTGCCTGACTCCCCGTCGTGTAGATAACTACGATACGGGAGGGCTTACCATCT 

GGCCCCAGTGCTGCAATGATACCGCGAGACCCACGCTCACCGGCTCCAGATTTATCAGCA

ATAAACCAGCCAGCCGGAAGGGCCGAGCGCAGAAGTGGTCCTGO^CTTTATCCGCCTCC 

ATCCAGTCTATTAATTGTTGCCGGGAAGCTAGAGTAAGTAGTTCGCCAGTTAATAGTTTG 

CGCAACGTTGTTGCCATTGCTACAGGCATCGTGGTGTCACGCTCGTCGTTTGGTATGGCT 

TCATTCAGCTCCGGTTCCCAACGATCAAGGCGAGTTACATGAT^ 

AAAGCGGTTAGCTCCTTCGGTCCTCCGATCGTTGTCAGAAGTAAGTTGGCCGCAGTGTTA 

toictcatggttatggcs^gcactgcataattctcttactgtca^tgccsvt^ 
ttttctgtgactggtgagtactouvccaagtcattctgagaatagtgtatgcggcgaccg 
agttgctcttgcccggcgtcauvtacgggataataccgcgccyvcatagaigaactttaaaa 
gtgctcatc:attgga?u^cgttcttcggggcgaaaactctcaaggatcttaccgct 

AGATCCAGTTCGATGTAACCCACTCGTGCACCCAACTGATCTTCAGCATCTTTTAC 

accagcgtttctgggtgagcaaaaacaggaaggcaaaatgccgc?^^^ 

GCGACACGGAAATGTTGAATACTCy^TACTCTTCCTTTTTCaATATTATTGAAGC^ 

Ca^GGGTTATTGTCTCMGAGCGGATACATATTTGAATGTATTTAGAAAAATAAAC^^ 

GGGGTTCCGCGCACATTTCCCCGAAAAGTGCCACCTAAATTGTAAGCGTTAATATTTTGT 

TAAAATTCGCGTTAAATTTTTGTTAAATCAGCTCATTTTTTAACCT^TAGGCCGAAAT^ 

GCAAAATCCCTTATAAATCAAAAGAATAGACCGAGATAGGGTTGAGTGTTGTTCa\GTTT 

GGAAC^AGAGTCCACTATTAAAGAACGTGGACTCOU^CGTCAAAGGGCGAAAAACCGT^ 

ATCAGGGCGATGGCCCACTACGTGAACCaiTCACCCTAATCAAGTTTTTO 

GCCGTAAAGCACrAAATaSGAACCCTAAAGGGAGCCCCCGATTTAGAGCTTGACGGGQAA 

AGCCAACCTGGCTTATCGAAATTAATACGACTCACTATAGGGAGACCGGC 
SEQ ID No. 12 - pONY3.1

AGATCTTCAATATTGGCCATTAGCCATATTATTCATTGGTTATATAGCATAAATCAATAT 

TGGCTATTGGCCATTGCATACGTTGTATCTATATCa^TAATATGTACyVTTTATATTGGCTC 

ATGTCCAATATGACCGCCATGTTGGCATTGATTATTGACTAGTTATTAATAGTAATCAAT

TACGGGGTCATTAGTTCATAGCCCATATATGGAGTTCCGCGTTACATAACTTACGGTAAA

TGGCCCGCCTGGCTGACCGCCCAACGACCCCCGCCCATTGACGTCAATAATGACGTATGT

TCCCATAGTAACGCCa^TAGGGACTTTCCATTGACGTCAATGGGTGGAGTATTTACGGTA 

AACTGCCCACTTGGCAGTACATCAAGTGTATCATATGCCAAGTCCGCCCCCTATTGACGT 

CAATGACGGTAAATGGCCCGCCTGGCATTATGCCCAGTACATGACCTTACGGGACTTTCC 

TACTTGGCAGTACATCTACGTATTAGTCATCGCTATTACCATGGTGATGCGGTTTTGGCA 

GTAOlCaUVTGGGCGTGGATAGCGGTTTGACTCACGGGGATTTCCa^ 

TGACGTOU^TGGGAGTTTGTTTTGGCACa^AAATC^ 

CaiACTGCGATTCCCCGCCCCGTTGACGCA^TGGGCGGTAGGCGTGTACGGTGGGAGGTC 

TATATAAGCAGAGCTCGTTTAGTGAACCGTCAGATCACTAGAAGCTTTATTGCGGTAGTT 

TATCACAGTTAAATTGCTAACGCAGTCAGTGCTTCTGACACAAC^ 

TGCAGTGACTCTCTTAAGGTAGCCTTGCAGAAGTTGGTCGTGAGGCACTGGGCAGGT^ 

TATO^GGTTAOFAGACAGGTTTAAGGAGACCAATAGAAACTGGGCTTGTCGAGAC^ 

AAGACTCTTGCXSTTTCTGATAGGCACCTATTGGTCTTACTGACATCCAC^^^ 

CTCCACAGGTGTCCACTCCCAGTTCAATTACAGCTCTTAAGGCTAGAGTACTTAATACGA 

CTCACTATAGGCTAGCCTCGAGGTCGACGGTATCGCCCGAACAGGGACCTGAGAGGGGCG 

CAGACCCTACCTGTTGAACCTGGCTGATCGTAGGATCCCCGGGACAGCAGAGGAGAACTT 

AC^GAAGTCTTCTGGAGGTGTTCCTGGCCAGAACACAGGAGGACAGGTAAGATGGGAGAC 

CCTTTGACATGGAGCAAGGCGCTCAAGAAGTTAGAGAAGGTGACGGTACaAGGGTCT 

AAATTAACTACTGGTAACTGTAATTGGGCGCTAAGTCTAGTAGACTTATTTCATGATACC 

AACTTTGTAAAAGA7^AAGGACTGGCaiGCTGAGGGATGTCA.TTCCATTGCTGGAAGATGTA 

ACTCAGACGCTGTCAGGACAAGAAAGAGAGGCCTTTGAAAGAAOVTGGTGGGCAATTTCT 

GCTGTAAAGATGGGCCTCCAGATTAATAATGTAGTAGATGGAAAGGCATCATTCCAGCTC 

CTAAGAGCGAAATATGAAAAGAAGACTGCTAATAAAAAGCAGTCTGAGCCCTCTGAAGAA

TATCCAATCATGATAGATGGGGCTGGAAACAGAAATTTTAGACCTCTi^ 

TATACTACTTGGGTGAATACCATACAGACAAATGGTCTATTAAATGAAGCTAQTC^AA^ 

TTATTTGGGATATTATCAGTAGACTGTACTTCTGAAGAAATGAATGCATTTTTGGATGTG 

GTACCTGGCCy^GGCT^GGACAAAAGCAGATATTACTTGATGCAATTGATAAGATAGC^ 

GATTGGGATAATAGACATCCATTACCGAATGCTCCACTGGTGGCACCACCACAAGGGCCT 

ATTCCCATGACAGCAAGGTTTATTAGAGGTTTAGGAGTACCTAGAGAAAGACAGATGGAG 

CCTGCTTTTGATCAGTTTAGGO^GACATATAGACAATGGATAATAGAA 

GGCATCAAAGTGATGATTGGAAAACCTAAAGCTCAAAATATTAGGCAAGGAGCTAAGGAA 

CCTTACCCAGAATTTGTAGACAGACTATTATCCCAAATAAAAAGTGAGGGACATCCACAA 

GAGATTTCAAAATTCTTGACTGATACACTGACTATTCAGAACGCAAATGAGGAATGTAGA 

AATGCTATGAGACATTTAAGACCAGAGGATACATTAGAAGAGAT^TGTATGCTTGCAGA 

GACATTGGAACTACAAZU^CaVAT^GATGATGTTATTGGCAAAAGCACT 

GCGGGCCCATTTAAAGGTGGAGCCTTGAAAGGAGGGCCACTAAAGGCAGCACAAACATGT 

TATAACTGTGGGAAGCCAGGACATTTATCTAGTCAATGTAGAGCACCTAAAGTCTGTTTT 

AAATGTAAACAGCCTGGACATTTCTCAAAGCAATGCAGAAGTGTTCCAAAAAACGGGAAG 

CAAGGGGCTCAAGGGAGGCCCCAGAAACAAACTTTCCCGATACAACAGAAGAGTCA^ 

AACAAATCTGTTGTACAAGAGACTCCTOlGACTCaVAAATCTGTACCCAGATCTGAGa^ 

ATAAT^AAAGGAATAOWVTGTCAAGGAGAAGGATCS^GTAGAGGATCTa^ 

TTGTGGGAGTAACy^TATAATCTAGAGAAAAGGCCTACTACAATAGTATTAATTAATO 

CTCCCTTAAATGTACTGTTAGACACAGGAGCAGATACTTCAGTGTTGACTACTGCACATT 

ATAATAGGTTAAAATATAGAGGGAGAAAATATCAAGGGACGGGAATAATAGGAGTGGGAG 

GAAATGTGGAAACATTTTCTACGCCTGTGACTATAAAGAAAAAGGGTAGACy^CATTA^ 

CAAGAATGCTAGTGGCAGATATTCCAGTGACTATTTTGGGACGAGATATTCTO 

TAGGTGCAAAATTGGTTTTGGCACAQCTCTCCAAGGAAATAAAATTTAGAAAAA^^ 

TAAAAGAGGGCACAATGGGGCCAAAAATTCCTCAATGGCCACTCACTAAGGAG^^ 

AAGGGGCCAAAGAGATAGTCCAAAGACTATTGTCAGAGGGAAAAATATCAGAAGCTAGTG 

ACAATAATCCTTATAATTCACCCATATTTGTAATAAAAAAGAGGTCTGGCAAATGGAGGT 

TATTACAAGATCTGAGAGAATTAAACAAAACAGTACAAGTAGGAACGGAAATATCCAGAG 

GATTGCCTCACCCGGGAGGATTAATTAAATGTAAACACATGACTGTATTAGATATTGGAG 

ATGCATATTTCACTATACCCTTAGATCCAGAGTTTAGACCATATACAGCTTTCACTATTC 

CCTCCATTAATCATCAAGAACCAGATAAAAGATATGTGTGGAAATGTTTACCACAAGGAT 

TCGTGTTGAGCCCATATATATATCAGAAAACATTACAGGAAATTTTACAACCTTTTAGGG 
AAAGATATCCTGAAGTACAATTGTATOUVTATATGGATQATrTGTTO^TGGGAAGTAATC

GTTCTAAAAAACAACACAAAGAGTTAATCATAGAATTAAGGGCGATCTTACTGGAAAAGG

GTTTTGAGACACCAGATGATAAATTACAAGAAGTGCCACCTTATAGCTGGCTAGGTTATC 

AACTTTGTCCTGAAAATTGGAAAGTACAAAAAATGCAATTAGACATGGTAAAGAATCC^ 

CCCTTAATGATGTGCAAAAATTAATGGGGAATATAACATGGATGAGCTCAGGGATCCC^ 

GGTTGACAGTAAAACACATTGCAGCTACTACTAAGGGATGTTTAGAGTTGAATCAAAA^ 

TAATTTGGACGGAAGAGGCAOUU^GAGTTAGAAGAAAATAATGAGAAGATTAAAAATC 

CTCAAGGGTTACAATATTATAATCCAGAAGAAGAAATGTTATGTGAGGTTGAAATTAC^ 

AAAATTATGAGGCAACTTATGTTATAA?VACAATCACAAGGAATCCTATGGGCAGGTAAAA 

AGATTATGAAGGCTAATAAGGGATGGTO^CAGTAAAAAATTTAATGTTATTGTTGCAAC 

ATGTGGCAACAGAAMTATTACTAGAGTAGGAAAATGTpCAACGTTTAAGGTACCAT^ 

CCaAAGAGCAAGTAATGTGGGAAATGCAAAAAGGATGGTATTATTCTTGGCTCCC^ 

TAGTATATAOVCU^TCAAGTAGTTCATGATGATTGGAGAATGAAATTGGTAGAAGAACCTA 

CATCAGGAATAACAATATACACTGATGGGGGAAAACAAAATGGAGAAGGAATAG^^ 

ATGTGACCy^GTAATGGGAGAACTAAACAGAAAAGGTTAGGACCTGTCACTa^TCAAGTTG 

CTGAAAGAATGGCAATACMVATGGO^TTAGAGGATACCS^GAGATAAAa^A 

TAACTGATAGTTATTATTGTTGGAAAAATATTACTWaAAGGATTAGGTTTAGAAGGACC^^ 

AAAGTCCTTGGTGGCCTATAATACAAAATATACGAGAAAAAGAGATAGTTTATTTTGCTT 

GGGTACCTGGTCACAAAGGGATATATGGTAATCAATTGGCAGATGAAGCCGCAAA^ 

AAGAAGAAATCATGCTAGCa^TACCT^GGCAGACAAATTAAAGAGAAAAGAG^^ 

CAGGGrrTGACTTATGTGTTCCTTATGACATCATGATACCTGTATCTGACyvai^^ 

TACCCACAGATGTAAAAATTCT^TTCCTCCTAATAGCTTTGGATGGGTCACTGGGA^ 

CATCAATGGCaUU\ACAGGGGTTATTAATTAATGGAGGAATAATTGATGAAGGATAT^ 

GAGAAATACAAGTGATATGTACTAATATTGGAAAAAGTAATATTAAATTAATAGAGGGAC 

AAAAATTTGCACAATTAATTATACTACAGCATCACTCAAATTCCA^^ 

aaaataaaatatctcagagaggggataaaggatttggaagtacaggagtattctgagtag 

aaaatattcaggt^gca^caagatgaacy^tgagaattggcatacatcacc^^ 

caagaaattataagataca^ttgactgtago^aaactvgataactcaagaatgtcctc^ 

qcactaagcaaggatcaggacctgcaggttgtgtcatgagatctcctaatcattggca^ 

caqattgcacacatttggacaataagataatattgacttttgtagagtca;^ 

acatacatgctacavttattgtcaaaagaaaatg(^ttatgtacttcattggctatto^ 

AATGGGO^GATTGTTTTl^CC^^GTCCTTACaa^CaiGATAACGG^^ 
CAGAACCAGTTGTAAATTTGTTGAAGTTCCTAAAGATAGCACATACCACAGGAATAC^^ 
ATCATCCAGAAAGTCAGGGTATTGTAGAAAGGGCAAATAGGACCTTGAAAGAGAAGATTC 
AAAGTCATAGAGAOU^CACTCAAAOlCrGGAGGCAGCTTTAC^^ 

GTAAC7VAAGGGAGGGAAAGTATGGGAGGACAGACACCATGGGAAGTATTTATCACTAATC 

AAGCACAAGTAATACATGAGAAACTTTTACTACAGCAAGCACAATCCTCCAAAAAATTT^ 

GTTTTTACAAAATCCCTGGTGAACATGATTGGAAGGGACCTACTAGGGTGCTGTGGAAGG

GTGATGGTGCAGTAGTAGTTAATGATGAAGGAAAGGGAATAATTGCTGTACCATTAACCA 

GQACTAAGTTACTAATAAAACCAAATTGAGTATTGTTGCAGGAAGCAAGACCCAA 

ATTGTCy^GCTGTGTTTCCTGAGGTCTCTAGGAATTGATTACCTCGATGCTTCATTAAGGA 

AGAAaAATAAACAAAGACTGAAGGCM.TCCAACAAGGAAGACAACCTCAATATTTGTTAT 

AAGGTTTGATATATGGGAGTATTTGGTAAAGGGGTAACATGGTCAGCATCGCATTCTATG 

GGGGAATCCCT^GGGGGAATCTO^CCCCTATTACCCAACAGTayBAAAAATCTAAGTC^ 

AGGAGAACACAATGTTTCa^CCTTATTGTTATAATAATGACAGT^^ 

AATCGAAGGAAGCAAGAGACCAAGAAATGAACCTGAAAGAAGAATCTAAAGAAGAAAAAA 

GAAGAAATGACTGGTGGAAAATAGGTATGTTTCTGTTATGCTTAGCAGGAACTACTGGAG

GAATACTTTGGTGGTATGAAGGACTCCCACAGCAACATTATATAGGGTTGGTGGCGATAG

GGGGAAGATTAAACGGATCTGGCCAATCAAATGCTATAGAATGCTGGGGTTCCTTCCCGG

ggtgtagaccatttcaaaattacttcagttatgagacc^atagaagcatc 

ATAATACTGCTACATTATTAGAAGCTTTAACCAATATAACTGCTCTATAAATAACAAAAC

AGAATTAGAAACATGGAAGTTAGTAAAGACTTCTGGCATAACTCCTTTACCTATTTCTTC

TGAAGCTAACACTGGACTAATTAGACATAAGAGAGATTTTGGTATAAGTGCAATAGTGGC

AGCTATTGTAGCCGCTACTGCTATTGCTGCTAGCGCTACTATGTCTTATGTTGCTCTAAC

tgaggttaacaaaataatggaagtacau^tcatacttttgaggtagaaaatagtactct 

aaatggtatggatttaatagaacgao^taaagatattatatgctatgattcttcaa^ 

acatgcagatcStto^ctgttai^ggtvaagacaacaggtagaggagacatttaatt^ 

TGGATGTATAGAAAGAACACATGTATTTTGTCATACTGGTCATCCCTGGAATATGTCATG

GGGACATTTAAATGAGTCAACACAATGGGATGACTGGGTAAGCAAAATGGAAGATTTAAA 

TCAAGAGATACTAACTACACTTOVTGGAGCCAGGAACAATTTGGCACaATCC^^ 

ATTCy^TACACCAGATAGTATAGCTO^TTTGGAAAAGACCTTTGGAGTCATAT^ 
TTGGATTCCTGGATTGGGAGCTTCCATTATAAAATATATAGTGATGTTTTTGCTTATTTA

TTTGTTACTAACCTCTTCGCCTAAGATCCTCAGGGCCCTCTGGAAGGTGACCAGTGGTGC 

AGGGTCCTCCGGCAGTCGTTACCTGAAGAAAAAATTCO^TaVCAAAC^ 

AGACACCTGGGACCAGGCCCAACACAACATACACCTAGCAGGCGTGACCGGTGGATCAGG 

GGACAAATACTACAAGCAGAAGTACTCCAGGAACGACTGGAATGGAGAATCAGAGGAGTA 

CAACAGGCGGCCAAAGAGCTGGGTGAAGTCAATCGAGGCATTTGGAGAGAGCTATATTTC 

CGAGAAGACa^AAGGGGAGATTTCTCAGCCTGGGGCGGCTATCAACGAGCAaVAGAACGG 

CTCTGGGGGGAACAATCCTa\.CCAAGG6TCCTTAGACCTGGAGATTCGAAGCGA^ 

AAACATTTATGACTGTTGCATTAAAGCCCAAGAAGGAACTCTCGCTATCCCTTGCT 

ATTTCCCTTATGGCTATTTTGGGGACTAGTAATTATAGTAGGACGCATAGCAGGCTATGG 

ATTACGTGGACTCGCTGTTATAATAAGGATTTGTATTAGAGGCTTAAATTTGATATTTGA 

AATAATCAGAAAAATOCTTGATTATATTGGAAGAGCTTTAAATCCTGGCACATCTCATGT 

ATOUITGCCTCAGTATGTTTAGAAAAACAAGGGGGGAACTGTGGGG 

TTTATAAATGATTATAAGAGTAAAAAGAAAGTTGCTGATGCTCTCATAACCTTGTATAAC 

CCAAAGGACTAGCTCATGTTGCTAGGOVACTAAACCGCAATAACCGCATTTGTGACGCGA 

GTTCCCCATTGGTGACGCGTGGTACCTCTAGAGTCGACCCGGGCGGCCGCTTCCCTTTAG 

TGAGGGTTAATGCTTCGAGCAGACATGATAAGATACATTGATGAGTTTGGACa^CC^ 

ACTAGAATGCAC3TGAAAAAAATGCTTTATTTCTGAAATTTGTGATO 

GTAACCATTATAAGCTGCa^TAAAaUWSTTAACaU^C^ 

aVGGTTQW3GGGGAGATGTGGGAGGTTTTTTAAAGCAAGTAAAACCTCTACAAATC 

AAAATCCGATAAGGATCGATCCGGGCTGGCGTAATAGCGAAGAGGCCCGCACCGATCGCC

CTTCCCAACAGTTGCGCAGCCTGAATGGCGAATGGACGCGCCCTGTAGCGGCGCATTAAG 

CGCGGCGGGTGTGGTGGTTACGCGCAGCGTGACOSCTACACTTGCCy^ 

CGCTCCTTTCGCTTl'CTTCCCTTCCTraCTCXJCCACGTTCGCCGGCTTTCCCCGTC^ 

TCTATVATCGGGGGCTCCCTTTAGGGTTCCGATTTAGAGCTTTACGGCACCTCGACCGC^ 

AAAACn^GATTTGGGTGATGGTTCACGTAGTGGGCCATCGCCCTGATAGACGGTTTCT 

CCCTTTGACGTTGGAGTCCACGTTCTTTAATAGTGGACTCTTGTTCCAAACTGGAAC^ 

ACTCAACCCTATCTCGGTCTATTCTTTTGATTTATAAGGGATTTTGCCGATTTCGGCCTA

TTGGTTAAAAAATGAGCTGATTTAACaUU^TATTTAACGCGAATTTTAACAA^^ 

QTTTACaATTTCGCCTGATGCXSGTATTTTCTCCTTACGCATCT 

GCATACGCGGATCTGCGCAGCACCATGGCCTGAAATAACCTCTGAAAGAGGAACTTGGTT 
AGGTACCTTCTGAGGCGGAAAGAACCAGCTGTGGAATGTGTGTCAGTTAGGGTGTGGAAA 
GTCCCCAGGCTCCCCAGCAGGCAGAAGTATGCAAAGCATGCATCTCAATTAGTCAGCAAC 
CAGGTGTGGAAAGTCCCaiGGCTCCCCAGCAGGCaVGAAGTATGCAAAGCATGCATCTC^ 
TTAGTCAGCAACCATAGTCCCGCCCCTAACTCCGCCCATCCCGCCCCTAACTCCGCCCAG 

CGCCTCGGCCTCTGAGCTATTCCy^GAAGTAGTGAGGAGGCTTTTTTGGAGGCCTAGGCTT 

TTGCAAAAAGCTTGATTCTTCTGACACAACAGTCTCGAACTTAAGGCTAGAGCCy^CC^ 

ATTGAACAAGATGGATTGC^CGCAGGTTCTCCGGCCGCTTGGGTGGAGAGGCTATTCGGC 

TATGACTGGGCACAACAGACAATCGGCTGCTCTGATGCCGCCGTGTTCCGGCTGTCAGCG 

CAGGGGCGCGCGGTTCTTTTTGTCAAGACCGACCTGTCCGGTGCCCTGAATGAACTGCAG 

GAC3GAGGCAGCGCGGCTATa3TQGCTGGCCACGACGGGCGTTCCTTGCGCAQCTGTGCTC 

GACGTTGTCACTGAAGCGGGAAGGGACTGGCTGCTATTGGGCGAAGTGCCGGGGCAGGAT 

CTCCTGTCATCTCACCTTGCTCCTGCCGAGAAAGTATCCATCATGGCTGATGCAATGCGG 

CGGCTGCATACGCTTGATCCGGCTACCTGCCCATTCGACCACCAAGCGAAACATCGCATC 

GAGCGAGCACGTACTCGGATGGAAGCCGGTCTTGTCGATCAGGATGATCTGGACGAAGAG 

CATCA(3GGGCTCGCGCCaW3CCGAACTGTTCGCCAGGCTC3^GGCX3TO 

GAGGATCTCGTCGTGACCCATGGCGATGCCTGCTTGCCGAATATCATGGTGGAAAATGGC 

CGCTTTTCTGGATTCATCGACTGTGGCCGGCTGGGTGTGGCGGACCGCTATCAGGACATA 

GCGTTGGCTACCCGTGATATTGCTGAAGAGCTTGGCGGCGAATGGGCTGACCGCTTCCTC 

GTGCTTTACGGTATCGCCGCTCCCGATTCGCAGCGCATCGCCTTCTATCGCCTTCTTGAC 

GAGTTCTTCTGAGCGGGACTCTCGGGTTCGAAATGACCGACCAAGCGACGCCCAACCTGC

CyVTCACXSATGGCXIGCAATAAAATATCTTTATTTTCATTAC^ 

TGTGAATCGATAGCGATAAGGATCCGCGTATGGTGCACTCTCAGTACAATCTGCTCTGAT 
GCCGCATAGTTAAGCCAGCCCCGACACCCGCCAACACCCGCTGACGCGCCCTGACGGGCT 
TGTCTGCTCCCGGCATCCGCTTACAGACAAGCTGTGACCGTCTCCGGGAGCTGCATGTGT 
Cy^GAGGTTTTCACCGTCATCACCGAAACGCGCGAGACGAAAGGGCCTCGTGATACGCCTA 
TTTTTATAGGTTAATGTCATGATAATAATGGTTTCTTAGACGTCAGGTGGCACTTTTCGG 
GGAAATGTGCGCGGAACCCCTATTTGTrrATTTTTCTAAATACATTCAAATATGTATCCG 
CTCATGAGACAATAACCCTGATAAATGCTTCAATAATATTGAAAAAGGAAGAGTATGAGT 
ATTCAACATTTCCGTGTCGCCCTTATTCCCTTTTTTGCGGCATTTTGCCTTCCTGTTTTT 
GCTCa^CCCAGAAACGCTGGTGAAAGTAAAAGATGCTGAAGATCAGTTGGGTGCACGA

GGTTACATCGAACTGGATCTCAACAGCGGTAAGATCCTTGAGAGTTTTCGCCCCGAAGAA 

CGTTTTCCAATGATGAGCACTTTTAAAGTTCTGCTATGTGGCGCGGTATTATCCCGTATT 

GACGCCGGGCAAGAGCAACTCGGTCGCCGCATACACTATTCTCAGAATGACTTGGTTGAG 

TACTCACCAGTCACAGAAAAGCATCTTACGGATGGCATGACAGTAAGAGAATTATQ^ 

GCTGCaVTAACCATGAGTGATAACACTGCGGCCy^CTTACrTCTGACAACGATTC 

CCGAAGGAGCTAACCGCTTTTTTGCACAACATGGGGGATCATGTAACTCGCCTTGATCGT 

TGGGAACCGGAGCTGAATGAAGCCATACOy^CGACGAGCGTGACACCACGATGCCTGTA 

GCAATGGCAACAACGTTGCGCAAACTATTAACTGGCGAACTACTTACT^^ 

CAACAATTAATAGACTGGATGGAGGCGGATAAAGTTGCAGGACCACTTCTGCGCTCGGCC 

CTTCCGGCTGGCTGQTTTATTGCTGATTU^TCTGGAGCCGGTGAGCGTGGGTCT^ 

ATCATTGCAG(::a.CTGGGGCCAGATGGTAAGCCCTCCCGTATCGTAGTTATCTACy^CGAC 

GGGAGTCAGGCAACTATGGATGAACGAAATAGACAGATCGCTGAGATAGGTGCCTCACTG 

ATTAAGCATTGGTAACTGTCAGACCAAGTTTACTCATATATACTTTAGATTGATTTAAAA 

CTTCATTTTTAATTTAAAAGGATCTAGGTGAAGATCCTTTTTGATAATCTCATGACCAAA 

ATCCCTTAACGTGAGTTTTCQTTCCACTGAGCGTCy^GACCCCGTAGAAAAGATC^ 

TCTTCTTGAGATCCTTTTTTTCTGCGCGTAATCTGCTGCTTGCy^CAAAAAAACCACCG 

CTACCAGCGGTGGTTTGTTTGCCGGATCAAGAGCTACCAACTCTTTTTCCGAAGGTAACT 

GGCTTCAGCAGAGCGCAGATACCAAATACTGTCCTTCTAGTGTAGCCGTAGTTAGGCCAC 

OICTTCAAGAACTCTGTAGCACCGCCTACATACCTCGCTCTGCTAATCCTGTTACCAGTG 

GCTGCTGCCyVGTGGCGATAAGTCGTGTCTTACCGGGTTGGACrca^^ 

GATAAGGCGCAGCGGTCGGGCTGAACGGGGGGTTCGTGCACACAGCC^ 

AOSACCTACACCGAACTGAGATACCTACAGCGTGAGCTATGAGAAAGCGCCACGCTTCCC 

GAAGGGAGAAAGGCGGACAGGTATCCGGTAAGCGGCAGGGTCGGAACAGGAGAGCGCACG 

AGGGAGCTTCCAGGGGGAAACGCCTGGTATCTTTATAGTCCTGTCGGGTTTCGCCACCTC 

TGACTTGAGCGTCGATTTTTGTGATGCTCGTCAGGGGGGCGGAGCCTATGGAAAAACGCC

AOCAACGCGGCCTTTTTACGGTTCCTGGCCTTTTGCTGGCCTTOT^ 

C 
SEQ ID No. 13 - pCIneoERev

TGAATAATAAAATGTGTGTTTGTCCQAAATACGCGTTTTGAGATTTCTGTCGCCGACT 

ATTCATGTCGCGCGATAGTGGTGTTTATCGCCGATAGAGATGGCGATATTGGAAAAATTG 

ATATTTGAAAATATGGCATATTGAAAATGTCGCCGATGTGAGTTTCTGTGTAACTGATAT 

CGCCArXTTTCCAAAAGTGATTTTTGGGCATACGCGATATCTGGC 

GTTTACGGGGGATGGCGATAGACGACTTTGGTGACrTGGGCGATTCTGTGTGTCGCy^ 

ATCGCAGTTTCGATATAGGTGACAGACGATATGAGGCTATATCGCCGATAGAGGCGACAT

CAAGCTGGCACATGGCCAATGCATATCGATCTATACATTGAATCAATA^^ 

CATATTATTCATTGGTTATATAGCMAAATCAATATTGGCTATTGGCCATTGC^^ 

GTATCCATATCGTAATATGTACATTTATATTGGCTCa^TGTCCAACaiTTACCXSCCATC 

ACATTGATTATTGACTAGTTATTAATAGTAATCAATTACGGGGTCATTAGTTCATAGCCC 

ATATATGGAGTTCCGCGTTACATAACTTACGGTAAATGGCCCGCCTGGCTGACCGCCCAA 

CGACCCCCGCCCATTGACGTCAATAATGACGTATGTTCCCATAGTAACGCCAATAGGGAC 

TTTCCATTGACGTCAATGGGTGGAGTATTTACGGTAAACTGCCCACTTGGCAGTAGAT^ 

AGTGTATCy^TATGCCf^GTCCGCCCCCTATTG2VCGTCaATGACGGTAAATGGCrc 

GCATTATGCCa^TACaiTGACCTTACGGGACTTTCCTACTTGGCA^ 

AGTCATCQCTATTACCATGGTGATGCGGTTTTGGCAGTACACCAATGGGC^ 

GTrrGACTCACGGGGATTTCCAAGTCTCCACCCCATTGACGTCAATGGGAGTTTGT^ 

GCACCAAAATCAACGGGACTTTCCyU^TGTCGTAACAACTGCGATCGCCCGCCCM 

ACGCATU^TGGGCGGTAGGCGTGTACGGTGGGAGGTCTATATAAGCAGAGCTCGTTTAGTG 

AACCGTa^GATCACTAGAAGCTTTATTGCGGTAGTTTATCAGAGTTA2kATTGCTAA 

GTCAGTGCTTCTGACACAACAGTCTCGAACTTAAGCTGC^^ 

TTGCT^GAAGTTGGTCGTGAGGCyVCTGGGCAGGTAAGTATaUVGGTTAC^ 

AGGAGACCAATAGAAACTGGGCTTGTCGAGACAGAGAAGACTCTTGCGTTTCTGATAGGC 

ACCTATTGGTCTTACTGACy^TCCACTTTGCCTTTCTCTCCACAGGTGTCCACTCCCAGTT 

CAATTACAGCTCTTAAGGCTAGAGTACTTAATACGACTCACTATAGGCTAGTAACGGCCG 

CCAGTGTGCTGGAATTCGGCTTATGGOVGAATCGAAQGAAGCAAGAGACCAAG 

CCTGAAAGAAGAATCTAAAGA7W3AAAAAAGAAGAAATGACTGGTGGAAAATAG^ 

GGGCCCTCTGGAAGGTGACCyVGTGQTGCAGGGTCCTCCGGCAGTCGTTACCTGAAGAAAA 

AATTCCATCACAAACATGCATCGCGAGAAGACACCTGGGACCAGGCCCAACACAACA 

ACCTAGCTlGGCGTGACCGGTGGATCAGGGGACAAATACTACAAGCy^AAGTACTCCAGGA 

ACGACTGGAATGGAGAATCAGAGGAGTACAACAGGCGGCCAAAGAGCnXSGGTGAAGT 

TCGAGGCATTTGGAGAGAGCTATATTTCCGAGAAGACa^AAGGGGAGATTTCrC^ 

GGGCGGCTATO^CGAGCaVCaU^GAACGGCTCTGGGGGGAACyAT 

TAGACCTGGAGATTCGAAGCGAAGGAGGAAACATTTATGAAGCCGAATTCTGCAGATATC

CATCACACTGGCGGCCGCTTCCCTTTAGTGAGGGTTAATGCTTCGAGCAGACATGATAAG 

ATACATTGATGAGTTTGGACAAACCACAACTAGAATGCAGTGAAAAflAATGCTTTAT^ 

TGAAATTTGTGATGCTATTGCTTTATTTGTAACCATTATAAGCTGCAATAAACAAGTTAA 

a^a^CaLATTGCATTO^TTTTATGTTTC^^ 

TUlGCaU^TAAAACCTCTACT^TGTGGTAAAATCCGATAAGGATCGATCCGGGCTGGCGT 

T^TAGCGAAGAGGCCCGCACCGATCGCCCTTCCCAACAGTTGCGCAGCCTGAATGGCGAA 

TGGACGCGCCCTGTAGCGGCGCATTAAGCGCGGCGGGTGTGGTGGTTACGCGCAGCGTGA 

CCGCTACACTTGCCAGCGCCCTAGCGCCCGCTCCTTTCGCTTTCTTCCCTTCCTTTCTCG 

CCACGTTCGCCGGCTTTCCCCGTCAAGCTCTAAATCGGGGGCTCCCTTTAGGGTTCCGAT 

TTAGAGCTTTACGGCACCTCGACCGOU^AAJUICTTGATTTGGGTQATGGT^ 

GGCCATCGCCCl^TAGACGGTTTTTCGCCCTTTGACGTTGGAGTCCA 

GTGGACTCTTGTTCCAAACTGGAACAACACTCAACCCTATCTCGGTCTATTCTTTTGATT 

TATAAGGGATTTTGCCGATTTCGGCCTATTGGTTAAAAAATGAGCTGATTTAACAAATAT

TTAACGCGAATTTTAACAAAATATTAACGTTTACAATTTCGCCTGATGCGGTATTTTCTC

CTTACGCATCTGTGCGGTATTTO^GACCGCyiTACGCGGATCTCCGCAGCACC^ 

AAATAACCTCTGAAAGAGGAACTTGGTTAGGTACCTTCTGAGGCGGAAAGAACCAGCTGT 

GGAATGTGTGTCAGTTAGGGTGTGGAAAGTCCCCAGGCTCCCCAGCAGGCAGAAGTATGC 

AAAGCATGCATCTCAATTAGTCAGC?VACCAGGTGTGGAAAGTCCCCAGGCTCCCCAGC^^ 

GCAGAAGTATGCAAAGCATGCATCTCAATTAGTCAGCAACCATAGTCCCGCCCCTAACTC

CGCCCATCCCGCCCCTAACTCCGCCCAGTTCCGCCCATTCTCCGCCCCATGGCTGACTAA

TTTTTTTTATTTATGCAGAGGCCGAG6CCGCCTCGGCCTCTGAGCTA 

GAGGAGGCTTTTTTGGAGGCCTAGGCTTTTGCaAAAAGCTTGATTCTTCTGACACTUVC^ 

TCTCGAACTTAAGGCTAGAGCCACCATGATTGAACAAGATGGATTGCACGCAGGTTCTCC 

GGCCGCTTGGGTGGAGAGGCTATTCGGCTATGACTGGGCACAACAGACAATCGGCTGCTC

TGATGCCGCCGTGTTCCGGCTGTCAGCGCAGGGGCGCCCGGTTCTTTTTGTC7AGA 
CCTGTCCGGTGCCCTGAATGAACTGCAGGACGAGGCAGCGCGGCTATCGTGGCTGGCCAC

GACGGGCGTTCCTTGCGCAGCTGTGCTCGACGTTGTCACTGAAGCGGGAAGGGACTGGCT 

GCTATTGGGCGAAGTGCCGGGGCAGGATCTCCTGTCATCTCACCTTGCTCCTGCCGAGAA 

AGTATCCATCATGGCTGATGCAATGCGGCGGCrrGCATACGCTTGATCCGGCT^^ 

ATTCGACCACCAAGCGAAACATCGCATCGAGCGAGCACGTACTCGGATGGAAGCCGGTCT 

TGTCGATCAGGATGATCTGGACGAAGAGCATCAGGGGCTCGCGCCAGCCGAACTGTTCGC 

CAGGCTCAAGGCGCGCATGCCCGAGGGCGAGGATCTCGTCGTGACCCATGGCGATGCCTG 

CTTGCCGAATATCATGGTGGAAAATGGCCGCTTTTCTGGATTCATCGACTGTGGCCGGCT 

GGGTGTGGCGGACCGCTATCAGGACATAGCGTTGGCTACCCGTGATATTGCTGAAGAGCT

TGGCGGCGAATGGGCTGACCGCTTCCTCGTGCTTTACGGTATCGCCGCTCCCGATTCGCA 

GCGC».TCGCCTTCTATCGCCTT(^GACGAGTTCTTCTGAGCGGGACTCTGGGGTTCGAA 

ATGACCGACCAAGCGACGCCC^CCTGCCATCACGATGGCCGCAATAAAATATCTTTAT^ 

TTCATTACATCTGTGTGTTGGTTTTTTGTGTGAATCGATAGCGATAAGGATCCGCGT^^ 

GTGCACTCTCAGTACAATCTGCTCTGATGCCGCATAGTTAAGCCyV^^^ 

AACa.CCCGCTGACGCGCCCTGACQGGCTTGTCTGCrCCCGGCATCCGCTTACAGAC^ 

TGTGACCGTCTCCGGGAGCTGCATGTGTCAGAGGTTTTCACCGTCATCACCGAAACGCGC 

GAGACGAAAGGGCCTCGTGATACGCCTATTTTTATAGGTTAATGTCATGATAATAATGGT 

TTCTTAGACGTCaVGGTGGCACTTTTCGGGGAAATGTGCGCGGAACCCCTATTTGTTTATT 

TTTCTAAATACy^TTCAAATATGTATCCGCTCATGAGAO^TAACCCTQATAA^ 

ATAATATTCAAAAAGGAAGAGTATGAGTATTCAACATTTCCGTGTCGCCCTTATTCCCTT 

TTTTGCGGCATTTTGCCTTCCTGTTTTTGCTCACCCAGAAACGCTGGTGAAAGTAAAAGA 

TGCTGAAGATCAGTTGGGTGCACGAGTGGGTTACATCGAACTGGATCTCAACAGCGGTAA

gatccttgagagttttcgccccgaagaacgitttca^tgatgagcacrt^ 

gctatgtggcgcggtattatcccgtatoxaacgccgggcaagaga^ctcgqtcgccgcat 

agactattctcsigaatgacttggttgagtactcaccagtcacaga;^ 

tggCatgacagtaagagaattatgcagtgctgco^taaccatgagtgataacactg^^^ 

caacttacttctgacaacgatcggaggaccgaaggagctaaccgcttttttgcacaac^^ 

gggggatcsitgtaactcgccttgatcgttgggaaccgqagctgaatgaagccataccaaa 

CGACGAGCGTGACACCACGATGCCraTAGCy^TGGCAACAACGTTC 
TGGCGAACTACOTACTCTAGCTTCCCGGCAACS^TTAATAGACrG 

agttgcaggaccaCttctgcgctcggcccttccggctggctggtttattgctgataaatc 

TGGAGCCGGTGAGCGTGGGTCTCGCGGTATCATTGCAGCACTGGGGCCAGATGGTAAGCC 

ctcccgtatcgtagttatctacacgacggggagtcaggcaactatggatgaacqaaatag 

ACTVGATCGCTGAGATAGGTGCCTCACTGATTAAGO^TTGGTAACTQTCAGACCAAGTT^ 

ctcatatatactttagattgatttaaaacttcatttttaatttaa^ 

gatcctttttgataatctcatgacaiaaatcccttaacgtgagttttcgttcc^ 

gtcagaccccgtagaaaagatcmaggatcttcttgagatcctttttttctgcgcgtaat 

CTGCTGCTTGCAAACMAAAAACCACCGCTACCAGCGGTGGTTTGTTT^ 

GCTACCAACTCTTTTTCCGAAGGTAACTGGCTTCAGCAGAGCGCAGATACC^^ 

CCTTCTAGTGTAGCCGTAGTTAGGCCACCACTTCAAGAACTCTGTAGCACCGCCTACATA 

CCTCGCTCTGCTAATCCTGTTACCAGTGGCTGCTGCCAGTGGCGATAAGTCGTGTCTTAC 

CGGGTTGGACTCAAGACGATAGTTACCGGATAAGGCGCAGCGGTCGGGCTGAACGGGGGG 

TTCGTGCACACAGCCCAGCTTGGAGCGAACGACCTACACCGAACTGAGATACCTACAGCG

TGAGCTATGAGAAAGCGCCACGCTTCCCGAAGGGAGAAAGGCGGACAGGTATCCGGTAAG 

CGGCAGGGTCGGAACAGGAGAGCGCACGAGGGAGCTTCCAGGGGGAAACGCCTGGTATCT 

TTATAGTCCTGTCGGGTTTCGCOICCTCTGACTTGAGCGTCXSATTTCT 

AGGGGGGCGGAGCCTATGGAAAAACGCCT^GCAACGCGGCCTTTTTACGGTTCCTGGCCTT 
TTGCTGGCCTTTTGCTCACATGGCTCGACAGATCT 
SEQ ID No. 14 - pESYNREV

TCAATATTGGCCATTAGCCATATTATTCATTGGTTATATAGCATAAATCAATATTGGCTA 

TTGGCCATTGCATACGTTGTATCTATATCATAATATGTACATTTATATTGGCTCATGTCC

AATATGACCGCCATGTTGGCATTGATTATTGACTAGTTATTAATAGTAATCAATTACGGG 

GTCATTAGTTCATAGCCCATATATGGAGTTCCGCGTTACATAACTTACGGTAAATGGCCC 

GCCTGGCTGACCGCCCAACGACCCCCGCCCATTGACGTCAATAATGACGTATGTTCCCAT 

AGTAACGCCAATAGGGACTTTCCATTGACGTCAATGGGTGGAGTATTTACGGTAAACTGC

CCACTTGGCAGTACATCAAGTGTATCATATGCCAAGTCCGCCCCCTATTGACGTCAATGA

CGGTAAATGGCCCGCCTGGCATTATGCCCAGTACATGACCTTACGGGACTTTCCTACTTG 

GCAGTACATCTACGTATTAGTCATCGCTATTACCATGGTGATGCGGTTTTGGCAGTACAC

CAATGGGCGTGGATAGCGGTTTGACTCACGGGGATTTCCAAGTCTCCACCCCATTGACGT 

CAATGGGAGTTTGTTTTGGCACCAAAATCAACGGGACTTTCCAAAATGTCGTAACAACTG

CGATCGCCCGCCCCGTTGACGCAAATGGGCGGTAGGCGTGTACGGTGGGAGGTCTATATA

AGCAGAGCTCGTTTAGTGAACCGTCAGATCACTAGAAGCTTTATTGCGGTAGTTTATCAC

AGTTAAATTGCTAACGCAGTCAGTGCTTCTGACACAACAGTCTCGAACTTAAGCTGCAGT

GACTCTCTTAAGGTAGCCTTGCAGAAGTTGGTCGTGAGGCACTGGGCAGGTAAGTATCAA

GGTTACAAGACAGGTTTAAGGAGACCAATAGAAACTGGGCTTGTCGAGACAGAGAAGACT

CTTGCGTTTCTGATAGGCACCTATTGGTCTTACTGACATCCACTTTGCCTTTCTCTCCAC

AGGTGTCCACTCCCAGTTCAATTACAGCTCTTAAGGCTAGAGTACTTAATACGACTCACT

ATAGGCTAGCCTCGAGAATTCGCCACCATGGCTGAGAGCAAGGAGGCCAGGGATCAAGAG

ATGAACCTCAAGGAAGAGAGCAAAGAGGAGAAGCGCCGCAACGACTGGTGGAAGATCGAC 

CCACAAGGCCCCCTGGAGGGGGACCAGTGGTGCCGCGTGCTGAGACAGTCCCTGCCCGAG 

GAGAAGATTCCTAGCCAGACCTGCATCGCCAGAAGACACCTCGGCCCCGGTCCCACCCAG 

CACACACCCTCCAGAAGGGATAGGTGGATTAGGGGCCAGATTTTGCAAGCCGAGGTCCTC

CaUVGAAAGGCTGGAATGGAGAATTAGGGGCGTGa^GAAGCCGCTAAA^ 

GTGAATCGCGGCATCTGGAGGGAGCTCTACTTCCGCGAGGACCAGAGGGGCGATTTCTCC

GCATGGGGAGGCTACCAGAGGGCACAAGAAAGGCTGTGGGGCGAGCAGAGCAGCCCCCGC 

GTCTTGAGGCCCGGAGACTCCAAAAGACGCCGCAAACACCTGTGAAGTCGACCCGGGCGG 

CCGCTTCCCTTTAGTGAGGGTTAATGCTTCGAGCAGACATGATAAGATACATTGATGAGT

TTGGACAAACCACAACTAGAATGCAGTGAAAAAAATGCTTTATTTGTGAAATTTGTGATG

CTATTGCTTTATTTGTAACCATTATAAGCTGCAATAAACAAGTTAACAACAACAATTGCA

TTCATTTTATGTTTCAGGTTCAGGGGGAGATGTGGGAGGTTTTTTAAAGCAAGTAAAACC

TCTACAAATGTGGTAAAATCCGATAAGGATCGATCCGGGCTGGCGTAATAGCGAAGAGGC 

CCGCACCGATCGCCCTTCCCAACAGTTGCGCAGCCTGAATGGCGAATGGACGCGCCCTGT 

AGCGGCGCATTAAGCGCGGCGGGTGTGGTGGTTACGCGCAGCGTGACCGCTACACTTGCC 

AGCGCCCTAGCGCCCGCTCCTTTCGCTTTCTTCCCTTCCTTTCTCGCCACGTTCGCCGGC

TTTCCCCGTCAAGCTCTAAATCGGGGGCTCCCTTTAGGGTTCCGATTTAGAGCTTTACGG

CACCTCGACCGCAAAAAACTTGATTTGGGTGATGGTTCACGTAGTGGGCCATCGCCCTGA 

TAGACGGTTTTTCGCCCTTTGACGTTGGAGTCCACGTTCTTTAATAGTGGACTCTTGTTC 

CAAACTGGAACAACACTCAACCCTATCTCGGTCTATTCTTTTGATTTATAAGGGATTTTG

CCGATTTCGGCCTATTGGTTAAAAAATGAGCTGATTTAACAAATATTTAACGCGAATTTT 

AACAAAATATTAACGTTTACAATTTCGCCTGATGCGGTATTTTCTCCTTACGCATCTGTG

CGGTATTTCACACCGCATACGCGGATCTGCGCAGCACCATGGCCTGAAATAACCTCTGAA

AGAGGAACTTGGTTAGGTACCTTCTGAGGCGGAAAGAACCAGCTGTGGAATGTGTGTCAG

TTAGGGTGTGGAAAGTCCCCAGGCTCCCCAGCAGGCAGAAGTATGCAAAGCATGCATCTC

AATTAGTCAGCAACCAGGTGTGGAAAGTCCCCAGGCTCCCCAGCAGGCAGAAGTATGCAA

AGCATGCATCTCAATTAGTCAGCAACCATAGTCCCGCCCCTAACTCCGCCCATCCCGCCC

CTAACTCCGCCCAGTTCCGCCCATTCTCCGCCCCATGGCTGACTAATTTTTTTTATTTAT

GCAGAGGCCGAGGCCGCCTCGGCCTCTGAGCTATTCCAGAAGTAGTGAGGAGGCTTTTTT 

GGAGGCCTAGGCTTTTGCAAAAAGCTTGATTCTTCTGACACAACAGTCTCGAACTTAAGG 

CTAGAGCCACCATGATTGAACAAGATGGATTGCACGCAGGTTCTCCGGCCGCTTGGGTGG 

AGAGGCTATTCGGCTATGACTGGGCACAACAGACAATCGGCTGCTCTGATGCCGCCGTGT

TCCGGCTGTCAGCGCAGGGGCGCCCGGTTCTTTTTGTCAAGACCGACCTGTCCGGTGCCC 

TGAATGAACTGCAGGACGAGGCAGCGCGGCTATCGTGGCTGGCCACGACGGGCGTTCCTT 

GCGCAGCTGTGCTCGACGTTGTCACTGAAGCGGGAAGGGACTGGCTGCTATTGGGCGAAG 

TGCCGGGGCAGGATCTCCTGTCATCTCACCTTGCTCCTGCCGAGAAAGTATCCATCATGG 

CTGATGCAATGCGGCGGCTGCATACGCTTGATCCGGCTACCTGCCCATTCGACCACCAAG

CGAAACATCGCATCGAGCGAGCACGTACTCGGATGGAAGCCGGTCTTGTCGATCAGGATG 

ATCTGGACGAAGAGCATCAGGGGCTCGCGCCAGCCGAACTGTTCGCCAGGCTCAAGGCGC 

GCATGCCCGACGGCGAGGATCTCGTCGTGACCCATGGCGATGCCTGCTTGCCGAATATCA

TGGTGGAAAATGGCCGCTTTTCTGGATTCATCGACTGTGGCCGGCTGGGTGTGGCGGACC

GCTATCAGGACATAGCGTTGGCTACCCGTGATATTGCTGAAGAGCTTGGCGGCGAATGGG 

CTGACCGCTTCCTCGTGCTTTACGGTATCGCCGCTCCCGATTCGCAGCGCATCGCCTTCT 

ATCGCCTTCTTGACGAGTTCTTCTGAGCGGGACTCTGGGGTTCGAAATGACCGACCAAGC

GACGCCCAACCTGCCATCACGATGGCCGCAATAAAATATCTTTATTTTCATTACATCTGT

GTGTTGGTTTTTTGTGTGAATCGATAGCGATAAGGATCCGCGTATGGTGCACTCTCAGTA

CAATCTGCTCTGATGCCGCATAGTTAAGCCAGCCCCGACACCCGCCAACACCCGCTGACG 

CGCCCTGACGGGCTTGTCTGCTCCCGGCATCCGCTTACAGACAAGCTGTGACCGTCTCCG 

GGAGCTGCATGTGTCAGAGGTTTTCACCGTCATCACCGAAACGCGCGAGACGAAAGGGCC

TCGTGATACGCCTATTTTTATAGGTTAATGTCATGATAATAATGGTTTCTTAGACGTCAG 

GTGGCACTTTTCGGGGAAATGTGCGCGGAACCCCTATTTGTTTATTTTTCTAAATACATT

CAAATATGTATCCGCTCATGAGACAATAACCCTGATAAATGCTTCAATAATATTGAAAAA

GGAAGAGTATGAGTATTCAACATTTCCGTGTCGCCCTTATTCCCTTTTTTGCGGCATTTT

GCCTTCCTGTTTTTGCTCACCCAGAAACGCTGGTGAAAGTAAAAGATGCTGAAGATCAGT

TGGGTGCACGAGTGGGTTACATCGAACTGGATCTCAACAGCGGTAAGATCCTTGAGAGTT

TTCGCCCCGAAGAACGTTTTCCAATGATGAGCACTTTTAAAGTTCTGCTATGTGGCGCGG

TATTATCCCGTATTGACGCCGGGCAAGAGCAACTCGGTCGCCGCATACACTATTCTCAGA

ATGACTTGGTTGAGTACTCACCAGTCACAGAAAAGCATCTTACGGATGGCATGACAGTAA

GAGAATTATGCAGTGCTGCCATAACCATGAGTGATAACACTGCGGCCAACTTACTTCTGA

CAACGATCGGAGGACCGAAGGAGCTAACCGCTTTTTTGCACAACATGGGGGATCATGTAA

CTCGCCTTGATCGTTGGGAACCGGAGCTGAATGAAGCCATACCAAACGACGAGCGTGACA 

CCACGATGCCTGTAGCAATGGCAACAACGTTGCGCAAACTATTAACTGGCGAACTACTTA 

CTCTAGCTTCCCGGCAACAATTAATAGACTGGATGGAGGCGGATAAAGTTGCAGGACCAC 

TTCTGCGCTCGGCCCTTCCGGCTGGCTGGTTTATTGCTGATAAATCTGGAGCCGGTGAGC 

GTGGGTCTCGCGGTATCATTGCAGCACTGGGGCCAGATGGTAAGCCCTCCCGTATCGTAG

TTATCTACACGACGGGGAGTCAGGCAACTATGGATGAACGAAATAGACAGATCGCTGAGA

TAGGTGCCTCACTGATTAAGCATTGGTAACTGTCAGACCAAGTTTACTCATATATACTTT

AGATTGATTTAAAACTTCATTTTTAATTTAAAAGGATCTAGGTGAAGATCCTTTTTGATA

ATCTCATGACCAAAATCCCTTAACGTGAGTTTTCGTTCCACTGAGCGTCAGACCCCGTAG

AAAAGATCAAAGGATCTTCTTGAGATCCTTTTTTTCTGCGCGTAATCTGCTGCTTGCAAA

CAAAAAAACCACCGCTACCAGCGGTGGTTTGTTTGCCGGATCAAGAGCTACCAACTCTTT

TTCCGAAGGTAACTGGCTTCAGCAGAGCGCAGATACCAAATACTGTCCTTCTAGTGTAGC

CGTAGTTAGGCCACCACTTCAAGAACTCTGTAGCACCGCCTACATACCTCGCTCTGCTAA 

TCCTGTTACCAGTGGCTGCTGCCAGTGGCGATAAGTCGTGTCTTACCGGGTTGGACTCAA 

GACGATAGTTACCGGATAAGGCGCAGCGGTCGGGCTGAACGGGGGGTTCGTGCACACAGC 

CCAGCTTGGAGCGAACGACCTACACCGAACTGAGATACCTACAGCGTGAGCTATGAGAAA

GCGCCACGCTTCCCGAAGGGAGAAAGGCGGACAGGTATCCGGTAAGCGGCAGGGTCGGAA 

CAGGAGAGCGCACGAGGGAGCTTCCAGGGGGAAACGCCTGGTATCTTTATAGTCCTGTCG

GGTTTCGCCACCTCTGACTTGAGCGTCGATTTTTGTGATGCTCGTCAGGGGGGCGGAGCC 

TATGGAAAAACGCCAGCAACGCGGCCTTTTTACGGTTCCTGGCCTTTTGCTGGCCTTTTG 

CTCACATGGCTCGACAGATCT 
SEQ ID No. 15 - codon optimised HIV gag-pol

ATGGGCGCCCGCGCCAGCGTGCTGTCGGGCGGCGAGCTGGACCGCTGGGAGAAGATCCGC 

CTGCGCCCCGGCGGCAAAAAGAAGTACAAGCTGAAGCACATCGTGTGGGCCAGCCGCGAA 

CTGGAGCGCTTCGCCGTGAACCCCGGGCTCCTGGAGACCAGCGAGGGGTGCCGCCAGATC 

CTCGGCCAACTGCAGCCCAGCCTGCAAACCGGCAGCGAGGAGCTGCGCAGCCTGTACAAC 

ACCGTGGCCACGCTGTACTGCGTCCACCAGCGCATCGAAATCAAGGATACGAAAGAGGCC 

CTGGATAAAATCGAAGAGGAACAGAATAAGAGCAAAAAGAAGGCCCAACAGGCCGCCGCG

GACACCGGACACAGCAACCAGGTCAGCCAGAACTACCCCATCGTGCAGAACATCCAGGGG 

CAGATGGTGCACCAGGCCATCTCCCCCCGCACGCTGAACGCCTGGGTGAAGGTGGTGGAA 

GAGAAGGCTTTTAGCCCGGAGGTGATACCCATGTTCTCAGCCCTGTCAGAGGGAGCCACC 

CCCCAAGATCTGAACACCATGCTCAACACAGTGGGGGGACACCAGGCCGCCATGCAGATG 

CTGAAGGAGACCATCAATGAGGAGGCTGCCGAATGGGATCGTGTGCATCCGGTGCACGCA 

GGGCCCATCGCACCGGGCCAGATGCGTGAGCCACGGGGCTCAGACATCGCCGGAACGACT 

AGTACCCTTCAGGAACAGATCGGCTGGATGACCAACAACCCACCCATCCCGGTGGGAGAA 

ATCTACAAACGCTGGATCATCCTGGGCCTGAACAAGATCGTGCGCATGTATAGCCCTACC 

AGCATCCTGGACATCCGCCAAGGCCCGAAGGAACCCTTTCGCGACTACGTGGACCGGTTC 

TACAAAACGCTCCGCGCCGAGCAGGCTAGCCAGGAGGTGAAGAACTGGATGACCGAAACC 

CTGCTGGTCCAGAACGCGAACCCGGACTGCAAGACGATCCTGAAGGCCCTGGGCCCAGCG 

GCTACCCTAGAGGAAATGATGACCGCCTGTCAGGGAGTGGGCGGACCCGGCCACAAGGCA 

CGCGTCCTGGCTGAGGCCATGAGCCAGGTGACCAACTCCGCTACCATCATGATGCAGCGC 

GGCAACTTTCGGAACCAACGCAAGATCGTCAAGTGCTTCAACTGTGGCAAAGAAGGGCAC 

ACAGCCCGCAACTGCAGGGCCCCTAGGAAAAAGGGCTGTTGGAAATGTGGAAAGGAAGGA 

CACCAAATGAAAGATTGTACTGAGAGACAGGCTAATTTTTTAGGGAAGATCTGGCCTTCC 

CACAAGGGAAGGCCAGGGAATTTTCTTCAGAGCAGACCAGAGCCAACAGCCCCACCAGAA 

GAGAGCTTCAGGTTTGGGGAAGAGACAACAACTCCCTCTCAGAAGCAGGAGCCGATAGAC 

AAGGAACTGTATCCTTTAGCTTCCCTCAGATCACTCTTTGGCAGCGACCCCTCGTCACAA 

TAAAGATAGGGGGGCAGCTCAAGGAGGCTCTCCTGGACACCGGAGCAGACGACACCGTGC 

TGGAGGAGATGTCGTTGCCAGGCCGCTGGAAGCCGAAGATGATCGGGGGAATCGGCGGTT

TCATCAAGGTGCGCCAGTATGACCAGATCCTCATCGAAATCTGCGGCCACAAGGCTATCG

GTACCGTGCTGGTGGGCCCCACACCCGTCAACATCATCGGACGCAACCTGTTGACGCAGA

TCGGTTGCACGCTGAACTTCCCCATTAGCCCTATCGAGACGGTACCGGTGAAGCTGAAGC 

CCGGGATGGACGGCCCGAAGGTCAAGCAATGGCCATTGACAGAGGAGAAGATCAAGGCAC 

TGGTGGAGATTTGCACAGAGATGGAAAAGGAAGGGAAAATCTCCAAGATTGGGCCTGAGA 

ACCCGTACAACACGCCGGTGTTCGCAATCAAGAAGAAGGACTCGACGAAATGGCGCAAGC 

TGGTGGACTTCCGCGAGCTGAACAAGCGCACGCAAGACTTCTGGGAGGTTCAGCTGGGCA 

TCCCGCACCCCGCAGGGCTGAAGAAGAAGAAATCCGTGACCGTACTGGATGTGGGTGATG 

CCTACTTCTCCGTTCCCCTGGACGAAGACTTCAGGAAGTACACTGCCTTCACAATCCCTT 

CGATCAACAACGAGACACCGGGGATTCGATATCAGTACAACGTGCTGCCCCAGGGCTGGA 

AAGGCTCTCCCGCAATCTTCCAGAGTAGCATGACCAAAATCCTGGAGCCTTTCCGCAAAC 

AGAACCCCGACATCGTCATCTATCAGTACATGGATGACTTGTACGTGGGCTCTGATCTAG 

AGATAGGGCAGCACCGCACCAAGATCGAGGAGCTGCGCCAGCACCTGTTGAGGTGGGGAC 

TGACCACACCCGACAAGAAGCACCAGAAGGAGCCTCCCTTCCTCTGGATGGGTTACGAGC 

TGCACCCTGACAAATGGACCGTGCAGCCTATCGTGCTGCCAGAGAAAGACAGCTGGACTG

TCAACGACATACAGAAGCTGGTGGGGAAGTTGAACTGGGCCAGTCAGATTTACCCAGGGA 

TTAAGGTGAGGCAGCTGTGCAAACTCCTCCGCGGAACCAAGGCACTCACAGAGGTGATCC 

CCCTAACCGAGGAGGCCGAGCTCGAACTGGCAGAAAACCGAGAGATCCTAAAGGAGCCCG

TGCACGGCGTGTACTATGACCCCTCCAAGGACCTGATCGCCGAGATCCAGAAGCAGGGGC 

AAGGCCAGTGGACCTATCAGATTTACCAGGAGCCCTTCAAGAACCTGAAGACCGGCAAGT 

ACGCCCGGATGAGGGGTGCCCACACTAACGACGTCAAGCAGCTGACCGAGGCCGTGCAGA 

AGATCACCACCGAAAGCATCGTGATCTGGGGAAAGACTCCTAAGTTCAAGCTGCCCATCC 

AGAAGGAAACCTGGGAAACCTGGTGGACAGAGTATTGGCAGGCCACCTGGATTCCTGAGT 

GGGAGTTCGTCAACACCCCTCCCCTGGTGAAGCTGTGGTACCAGCTGGAGAAGGAGCCCA 

TAGTGGGCGCCGAAACCTTCTACGTGGATGGGGCCGCTAACAGGGAGACTAAGCTGGGCA 

AAGCCGGATACGTCACTAACCGGGGCAGACAGAAGGTTGTCACCCTCACTGACACCACCA 

ACCAGAAGACTGAGCTGCAGGCCATTTACCTCGCTTTGCAGGACTCGGGCCTGGAGGTGA 

ACATCGTGACAGACTCTCAGTATGCCCTGGGCATCATTCAAGCCCAGCCAGACCAGAGTG 

AGTCCGAGCTGGTCAATCAGATCATCGAGCAGCTGATCAAGAAGGAAAAGGTCTATCTGG 

CCTGGGTACCCGCCCACAAAGGCATTGGCGGCAATGAGCAGGTCGACAAGCTGGTCTCGG 

CTGGCATCAGGAAGGTGCTATTCCTGGATGGCATCGACAAGGCCCAGGACGAGCACGAGA 

AATACCACAGCAACTGGCGGGCCATGGCTAGCGACTTCAACCTGCCCCCTGTGGTGGCCA 

AAGAGATCGTGGCCAGCTGTGACAAGTGTCAGCTCAAGGGCGAAGCCATGCATGGCCAGG 

TGGACTGTAGCCCCGGCATCTGGCAACTCGATTGCACCCATCTGGAGGGCAAGGTTATCC 

TGGTAGCCGTCCATGTGGCCAGTGGCTACATCGAGGCCGAGGTCATTCCCGCCGAAACAG

GGCAGGAGACAGCCTACTTCCTCCTGAAGCTGGCAGGCCGGTGGCCAGTGAAGACCATCC 

ATACTGACAATGGCAGCAATTTCACCAGTGCTACGGTTAAGGCCGCCTGCTGGTGGGCGG 

GAATCAAGCAGGAGTTCGGGATCCCCTACAATCCCCAGAGTCAGGGCGTCGTCGAGTCTA 

TGAATAAGGAGTTAAAGAAGATTATCGGCCAGGTCAGAGATCAGGCTGAGCATCTCAAGA 

CCGCGGTCCAAATGGCGGTATTCATCCACAATTTCAAGCGGAAGGGGGGGATTGGGGGGT 

ACAGTGCGGGGGAGCGGATCGTGGACATCATCGCGACCGACATCCAGACTAAGGAGCTGC 

AAAAGCAGATTACCAAGATTCAGAATTTCCGGGTCTACTACAGGGACAGCAGAAATCCCC 

TCTGGAAAGGCCCAGCGAAGCTCCTCTGGAAGGGTGAGGGGGCAGTAGTGATCCAGGATA 

ATAGCGACATCAAGGTGGTGCCCAGAAGAAAGGCGAAGATCATTAGGGATTATGGCAAAC 

AGATGGCGGGTGATGATTGCGTGGCGAGCAGACAGGATGAGGATTAG 
SEQ ID No. 16 - codon optimised EIAV gag-pol

ATGGGCGATCCCCTCACCTGGTCCAAAGCCCTGAAGAAACTGGAAAAAGTCACCGTTCAG 

GGTAGCCAAAAGCTTACCACAGGCAATTGCAACTGGGCATTGTCCCTGGTGGATCTTTTC 

CACGACACTAATTTCGTTAAGGAGAAAGATTGGCAACTCAGAGACGTGATCCCCCTCTTG 

GAGGACGTGACCCAAACATTGTCTGGGCAGGAGCGCGAAGCTTTCGAGCGCACCTGGTGG 

GCCATCAGCGCAGTCAAAATGGGGCTGCAAATCAACAACGTGGTTGACGGTAAAGCTAGC 

TTTCAACTGCTCCGCGCTAAGTACGAGAAGAAAACCGCCAACAAGAAACAATCCGAACCT 

AGCGAGGAGTACCCAATTATGATCGACGGCGCCGGCAATAGGAACTTCCGCCCACTGACT

CCCAGGGGCTATACCACCTGGGTCAACACCATCCAGACAAACGGACTTTTGAACGAAGCC 

TCCCAGAACCTGTTCGGCATCCTGTCTGTGGACTGCACCTCCGAAGAAATGAATGCTTTT 

CTCGACGTGGTGCCAGGACAGGCTGGACAGAAACAGATCCTGCTCGATGCCATTGACAAG

ATCGCCGACGACTGGGATAATCGCCACCCCCTGCCAAACGCCCCTCTGGTGGCTCCCCCA 

CAGGGGCCTATCCCTATGACCGCTAGGTTCATTAGGGGACTGGGGGTGCCCCGCGAACGC 

CAGATGGAGCCAGCATTTGACCAATTTAGGCAGACCTACAGACAGTGGATCATCGAAGCC

ATGAGCGAGGGGATTAAAGTCATGATCGGAAAGCCCAAGGCACAGAACATCAGGCAGGGG 

GCCAAGGAACCATACCCTGAGTTTGTCGACAGGCTTCTGTCCCAGATTAAATCCGAAGGC 

CACCCTCAGGAGATCTCCAAGTTCTTGACAGACACACTGACTATCCAAAATGCAAATGAA 

GAGTGCAGAAACGCCATGAGGCACCTCAGACCTGAAGATACCCTGGAGGAGAAAATGTAC

GCATGTCGCGACATTGGCACTACCAAGCAAAAGATGATGCTGCTCGCCAAGGCTCTGCAA 

ACCGGCCTGGCTGGTCCATTCAAAGGAGGAGCACTGAAGGGAGGTCCATTGAAAGCTGCA 

CAAACATGTTATAATTGTGGGAAGCCAGGACATTTATCTAGTCAATGTAGAGCACCTAAA

GTCTGTTTTAAATGTAAACAGCCTGGACATTTCTCAAAGCAATGCAGAAGTGTTCCAAAA

AACGGGAAGCAAGGGGCTCAAGGGAGGCCCCAGAAACAAACTTTCCCGATACAACAGAAG 

AGTCAGCACAACAAATCTGTTGTACAAGAGACTCCTCAGACTCAAAATCTGTACCCAGAT 

CTGAGCGAAATAAAAAAGGAATACAATGTCAAGGAGAAGGATCAAGTAGAGGATCTCAAC 

CTGGACAGTTTGTGGGAGTAACATACAATCTCGAGAAGAGGCCCACTACCATCGTCCTGA 

TCAATGACACCCCTCTTAATGTGCTGCTGGACACCGGAGCCGACACCAGCGTTCTCACTA 

CTGCTCACTATAACAGACTGAAATACAGAGGAAGGAAATACCAGGGCACAGGCATCATCG 

GCGTTGGAGGCAACGTCGAAACCTTTTCCACTCCTGTCACCATCAAAAAGAAGGGGAGAC 

ACATTAAAACCAGAATGCTGGTCGCCGACATCCCCGTCACCATCCTTGGCAGAGACATTC 

TCCAGGACCTGGGCGCTAAACTCGTGCTGGCACAACTGTCTAAGGAAATCAAGTTCCGCA 

AGATCGAGCTGAAAGAGGGCACAATGGGTCCAAAAATCCCCCAGTGGCCCCTGACCAAAG

AGAAGCTTGAGGGCGCTAAGGAAATCGTGCAGCGCCTGCTTTCTGAGGGCAAGATTAGCG 

AGGCCAGCGACAATAACCCTTACAACAGCCCCATCTTTGTGATTAAGAAAAGGAGCGGCA 

AATGGAGACTCCTGCAGGACCTGAGGGAACTCAACAAGACCGTCCAGGTCGGAACTGAGA 

TCTCTCGCGGACTGCCTCACCCCGGCGGCCTGATTAAATGCAAGCACATGACAGTCCTTG

ACATTGGAGACGCTTATTTTACCATCCCCCTCGATCCTGAATTTCGCCCCTATACTGCTT

TTACCATCCCCAGCATCAATCACCAGGAGCCCGATAAACGCTATGTGTGGAAGTGCCTCC

CCCAGGGATTTGTGCTTAGCCCCTACATTTACCAGAAGACACTTCAAGAGATCCTCCAAC

CTTTCCGCGAAAGATACCCAGAGGTTCAACTCTACCAATATATGGACGACCTGTTCATGG 

GGTCCAACGGGTCTAAGAAGCAGCACAAGGAACTCATCATCGAACTGAGGGCAATCCTCC 

TGGAGAAAGGCTTCGAGACACCCGACGACAAGCTGCAAGAAGTTCCTCCATATAGCTGGC 

TGGGCTACCAGCTTTGCCCTGAAAACTGGAAAGTCCAGAAGATGCAGTTGGATATGGTCA 

AGAACCCAACACTGAACGACGTCCAGAAGCTCATGGGCAATATTACCTGGATGAGCTCCG 

GAATCCCTGGGCTTACCGTTAAGCACATTGCCGCAACTACAAAAGGATGCCTGGAGTTGA 

ACCAGAAGGTCATTTGGACAGAGGAAGCTCAGAAGGAACTGGAGGAGAATAATGAAAAGA 

TTAAGAATGCTCAAGGGCTCCAATACTACAATCCCGAAGAAGAAATGTTGTGCGAGGTCG 

AAATCACTAAGAACTACGAAGCCACCTATGTCATCAAACAGTCCCAAGGCATCTTGTGGG 

CCGGAAAGAAAATCATGAAGGCCAACAAAGGCTGGTCCACCGTTAAAAATCTGATGCTCC

TGCTCCAGCACGTCGCCACCGAGTCTATCACCCGCGTCGGCAAGTGCCCCACCTTCAAAG 

TTCCCTTCACTAAGGAGCAGGTGATGTGGGAGATGCAAAAAGGCTGGTACTACTCTTGGC 

TTCCCGAGATCGTCTACACCCACCAAGTGGTGCACGACGACTGGAGAATGAAGCTTGTCG 

AGGAGCCCACTAGCGGAATTACAATCTATACCGACGGCGGAAAGCAAAACGGAGAGGGAA 

TCGCTGCATACGTCACATCTAACGGCCGCACCAAGCAAAAGAGGCTCGGCCCTGTCACTC 

ACCAGGTGGCTGAGAGGATGGCTATCCAGATGGCCCTTGAGGACACTAGAGACAAGCAGG 

TGAACATTGTGACTGACAGCTACTACTGCTGGAAAAACATCACAGAGGGCCTTGGCCTGG

AGGGACCCCAGTCTCCCTGGTGGCCTATCATCCAGAATATCCGCGAAAAGGAAATTGTCT 

ATTTCGCCTGGGTGCCTGGACACAAAGGAATTTACGGCAACCAACTCGCCGATGAAGCCG 

CCAAAATTAAAGAGGAAATCATGCTTGCCTACCAGGGCACACAGATTAAGGAGAAGAGAG 

ACGAGGACGCTGGCTTTGACCTGTGTGTGCCATACGACATCATGATTCCCGTTAGCGACA

CAAAGATCATTCCAACCGATGTCAAGATCCAGGTGCCACCCAATTCATTTGGTTGGGTGA 

CCGGAAAGTCCAGCATGGCTAAGCAGGGTCTTCTGATTAACGGGGGAATCATTGATGAAG

GATACACCGGCGAAATCCAGGTGATCTGCACAAATATCGGCAAAAGCAATATTAAGCTTA 

TCGAAGGGCAGAAGTTCGCTCAACTCATCATCCTCCAGCACCACAGCAATTCAAGACAAC 

CTTGGGACGAAAACAAGATTAGCCAGAGAGGTGACAAGGGCTTCGGCAGCACAGGTGTGT 

TCTGGGTGGAGAACATCCAGGAAGCACAGGACGAGCACGAGAATTGGCACACCTCCCCTA 

AGATTTTGGCCCGCAATTACAAGATCCCACTGACTGTGGCTAAGCAGATCACACAGGAAT 

GCCCCCACTGCACCAAACAAGGTTCTGGCCCCGCCGGCTGCGTGATGAGGTCCCCCAATC 

ACTGGCAGGCAGATTGCACCCACCTCGACAACAAAATTATCCTGACCTTCGTGGAGAGCA 

ATTCCGGCTACATCCACGCAACACTCCTCTCCAAGGAAAATGCATTGTGCACCTCCCTCG 
CAATTCTGGAATGGGCCAGGCTGTTCTCTCCAAAATCCCTGCACACCGACAACGGCACCA
ACTTTGTGGCTGAACCTGTGGTGAATCTGCTGAAGTTCCTGAAAATCGCCCACACCACTG
GCATTCCCTATCACCCTGAAAGCCAGGGCATTGTCGAGAGGGCCAACAGAACTCTGAAAG 
AAAAGATCCAATCTCACAGAGACAATACACAGACATTGGAGGCCGCACTTCAGCTCGCCC 
TTATCACCTGCAACAAAGGAAGAGAAAGCATGGGCGGCCAGACCCCCTGGGAGGTCTTCA 
TCACTAACCAGGCCCAGGTCATCCATGAAAAGCTGCTCTTGCAGCAGGCCCAGTCCTCCA 
AAAAGTTCTGCTTTTATAAGATCCCCGGTGAGCACGACTGGAAAGGTCCTACAAGAGTTT 
TGTGGAAAGGAGACGGCGCAGTTGTGGTGAACGATGAGGGCAAGGGGATCATCGCTGTGC
CCCTGACACGCACCAAGCTTCTCATCAAGCCAAACTGA
SEQ ID No. 17 - pIRES1hyg ESYNGP



AATTCGCCACCATGGGCGATCCCCTCACCTGGTCCAAAGCCCTGAAGAAACTGGAAAAAG 

TCACCGTTCAGGGTAGCCAAAAGCTTACCACAGGCAATTGCAACTGGGCATTGTCCCTGG 

TGGATCTTTTCCACGACACTAATTTCGTTAAGGAGAAAGATTGGCAACTCAGAGACGTGA 

TCCCCCTCTTGGAGGACGTGACCCAAACATTGTCTGGGCAGGAGCGCGAAGCTTTCGAGC 

GCACCTGGTGGGCCATCAGCGCAGTCAAAATGGGGCTGCAAATCAACAACGTGGTTGACG 

GTAAAGCTAGCTTTCAACTGCTCCGCGCTAAGTACGAGAAGAAAACCGCCAACAAGAAAC

AATCCGAACCTAGCGAGGAGTACCCAATTATGATCGACGGCGCCGGCAATAGGAACTTCC 

GCCCACTGACTCCCAGGGGCTATACCACCTGGGTCAACACCATCCAGACAAACGGACTTT 

TGAACGAAGCCTCCCAGAACCTGTTCGGCATCCTGTCTGTGGACTGCACCTCCGAAGAAA 

TGAATGCTTTTCTCGACGTGGTGCCAGGACAGGCTGGACAGAAACAGATCCTGCTCGATG 

CCATTGACAAGATCGCCGACGACTGGGATAATCGCCACCCCCTGCCAAACGCCCCTCTGG 

TGGCTCCCCCACAGGGGCCTATCCCTATGACCGCTAGGTTCATTAGGGGACTGGGGGTGC 

CCCGCGAACGCCAGATGGAGCCAGCATTTGACCAATTTAGGCAGACCTACAGACAGTGGA 

TCATCGAAGCCATGAGCGAGGGGATTAAAGTCATGATCGGAAAGCCCAAGGCACAGAACA 

TCAGGCAGGGGGCCAAGGAACCATACCCTGAGTTTGTCGACAGGCTTCTGTCCCAGATTA 

AATCCGAAGGCCACCCTCAGGAGATCTCCAAGTTCTTGACAGACACACTGACTATCCAAA 

ATGCAAATGAAGAGTGCAGAAACGCCATGAGGCACCTCAGACCTGAAGATACCCTGGAGG 

AGAAAATGTACGCATGTCGCGACATTGGCACTACCAAGCAAAAGATGATGCTGCTCGCCA 

AGGCTCTGCAAACCGGCCTGGCTGGTCCATTCAAAGGAGGAGCACTGAAGGGAGGTCCAT 

TGAAAGCTGCACAAACATGTTATAATTGTGGGAAGCCAGGACATTTATCTAGTCAATGTA 

GAGCACCTAAAGTCTGTTTTAAATGTAAACAGCCTGGACATTTCTCAAAGCAATGCAGAA 

GTGTTCCAAAAAACGGGAAGCAAGGGGCTCAAGGGAGGCCCCAGAAACAAACTTTCCCGA 

TACAACAGAAGAGTCAGCACAACAAATCTGTTGTACAAGAGACTCCTCAGACTCAAAATC 

TGTACCCAGATCTGAGCGAAATAAAAAAGGAATACAATGTCAAGGAGAAGGATCAAGTAG 

AGGATCTCAACCTGGACAGTTTGTGGGAGTAACATACAATCTCGAGAAGAGGCCCACTAC 

CATCGTCCTGATCAATGACACCCCTCTTAATGTGCTGCTGGACACCGGAGCCGACACCAG 

CGTTCTCACTACTGCTCACTATAACAGACTGAAATACAGAGGAAGGAAATACCAGGGCAC 

AGGCATCATCGGCGTTGGAGGCAACGTCGAAACCTTTTCCACTCCTGTCACCATCAAAAA 

GAAGGGGAGACACATTAAAACCAGAATGCTGGTCGCCGACATCCCCGTCACCATCCTTGG 

CAGAGACATTCTCCAGGACCTGGGCGCTAAACTCGTGCTGGCACAACTGTCTAAGGAAAT 

CAAGTTCCGCAAGATCGAGCTGAAAGAGGGCACAATGGGTCCAAAAATCCCCCAGTGGCC 

CCTGACCAAAGAGAAGCTTGAGGGCGCTAAGGAAATCGTGCAGCGCCTGCTTTCTGAGGG

CAAGATTAGCGAGGCCAGCGACAATAACCCTTACAACAGCCCCATCTTTGTGATTAAGAA

AAGGAGCGGCAAATGGAGACTCCTGCAGGACCTGAGGGAACTCAACAAGACCGTCCAGGT 

CGGAACTGAGATCTCTCGCGGACTGCCTCACCCCGGCGGCCTGATTAAATGCAAGCACAT 

GACAGTCCTTGACATTGGAGACGCTTATTTTACCATCCCCCTCGATCCTGAATTTCGCCC 

CTATACTGCTTTTACCATCCCCAGCATCAATCACCAGGAGCCCGATAAACGCTATGTGTG

GAAGTGCCTCCCCCAGGGATTTGTGCTTAGCCCCTACATTTACCAGAAGACACTTCAAGA 

GATCCTCCAACCTTTCCGCGAAAGATACCCAGAGGTTCAACTCTACCAATATATGGACGA 

CCTGTTCATGGGGTCCAACGGGTCTAAGAAGCAGCACAAGGAACTCATCATCGAACTGAG 

GGCAATCCTCCTGGAGAAAGGCTTCGAGACACCCGACGACAAGCTGCAAGAAGTTCCTCC 

ATATAGCTGGCTGGGCTACCAGCTTTGCCCTGAAAACTGGAAAGTCCAGAAGATGCAGTT

GGATATGGTCAAGAACCCAACACTGAACGACGTCCAGAAGCTCATGGGCAATATTACCTG 

GATGAGCTCCGGAATCCCTGGGCTTACCGTTAAGCACATTGCCGCAACTACAAAAGGATG 

CCTGGAGTTGAACCAGAAGGTCATTTGGACAGAGGAAGCTCAGAAGGAACTGGAGGAGAA

TAATGAAAAGATTAAGAATGCTCAAGGGCTCCAATACTACAATCCCGAAGAAGAAATGTT 

GTGCGAGGTCGAAATCACTAAGAACTACGAAGCCACCTATGTCATCAAACAGTCCCAAGG 

CATCTTGTGGGCCGGAAAGAAAATCATGAAGGCCAACAAAGGCTGGTCCACCGTTAAAAA 

TCTGATGCTCCTGCTCCAGCACGTCGCCACCGAGTCTATCACCCGCGTCGGCAAGTGCCC 

CACCTTCAAAGTTCCCTTCACTAAGGAGCAGGTGATGTGGGAGATGCAAAAAGGCTGGTA 

CTACTCTTGGCTTCCCGAGATCGTCTACACCCACCAAGTGGTGCACGACGACTGGAGAAT 

GAAGCTTGTCGAGGAGCCCACTAGCGGAATTACAATCTATACCGACGGCGGAAAGCAAAA 

CGGAGAGGGAATCGCTGCATACGTCACATCTAACGGCCGCACCAAGCAAAAGAGGCTCGG 

CCCTGTCACTCACCAGGTGGCTGAGAGGATGGCTATCCAGATGGCCCTTGAGGACACTAG 
AGACAAGCAGGTGAACATTGTGACTGACAGCTACTACTGCTGGAAAAACATCACAGAGGG

CCTTGGCCTGGAGGGACCCCAGTCTCCCTGGTGGCCTATCATCCAGAATATCCGCGAAAA 

GGAAATTGTCTATTTCGCCTGGGTGCCTGGACACAAAGGAATTTACGGCAACCAACTCGC 

CGATGAAGCCGCCAAAATTAAAGAGGAAATCATGCTTGCCTACCAGGGCACACAGATTAA 

GGAGAAGAGAGACGAGGACGCTGGCTTTGACCTGTGTGTGCCATACGACATCATGATTCC 

CGTTAGCGACACAAAGATCATTCCAACCGATGTCAAGATCCAGGTGCCACCCAATTCATT 

TGGTTGGGTGACCGGAAAGTCCAGCATGGCTAAGCAGGGTCTTCTGATTAACGGGGGAAT 

CATTGATGAAGGATACACCGGCGAAATCCAGGTGATCTGCACAAATATCGGCAAAAGCAA 

TATTAAGCTTATCGAAGGGCAGAAGTTCGCTCAACTCATCATCCTCCAGCACCACAGCAA 

TTCAAGACAACCTTGGGACGAAAACAAGATTAGCCAGAGAGGTGACAAGGGCTTCGGCAG 

CACAGGTGTGTTCTGGGTGGAGAACATCCAGGAAGCACAGGACGAGCACGAGAATTGGCA 

CACCTCCCCTAAGATTTTGGCCCGCAATTACAAGATCCCACTGACTGTGGCTAAGCAGAT 

CACACAGGAATGCCCCCACTGCACCAAACAAGGTTCTGGCCCCGCCGGCTGCGTGATGAG 

GTCCCCCAATCACTGGCAGGCAGATTGCACCCACCTCGACAACAAAATTATCCTGACCTT 

CGTGGAGAGCAATTCCGGCTACATCCACGCAACACTCCTCTCCAAGGAAAATGCATTGTG 

CACCTCCCTCGCAATTCTGGAATGGGCCAGGCTGTTCTCTCCAAAATCCCTGCACACCGA 

CAACGGCACCAACTTTGTGGCTGAACCTGTGGTGAATCTGCTGAAGTTCCTGAAAATCGC 

CCACACCACTGGCATTCCCTATCACCCTGAAAGCCAGGGCATTGTCGAGAGGGCCAACAG

AACTCTGAAAGAAAAGATCCAATCTCACAGAGACAATACACAGACATTGGAGGCCGCACT 

TCAGCTCGCCCTTATCACCTGCAACAAAGGAAGAGAAAGCATGGGCGGCCAGACCCCCTG 

GGAGGTCTTCATCACTAACCAGGCCCAGGTCATCCATGAAAAGCTGCTCTTGCAGCAGGC

CCAGTCCTCCAAAAAGTTCTGCTTTTATAAGATCCCCGGTGAGCACGACTGGAAAGGTCC 

TACAAGAGTTTTGTGGAAAGGAGACGGCGCAGTTGTGGTGAACGATGAGGGCAAGGGGAT 

CATCGCTGTGCCCCTGACACGCACCAAGCTTCTCATCAAGCCAAACTGAACCCGGGGCGG 

CCGCACTAGAGGAATTCGCCCCTCTCCCTCCCCCCCCCCTAACGTTACTGGCCGAAGCCG 

CTTGGAATAAGGCCGGTGTGTGTTTGTCTATATGTGATTTTCCACCATATTGCCGTCTTT 

TGGCAATGTGAGGGCCCGGAAACCTGGCCCTGTCTTCTTGACGAGCATTCCTAGGGGTCT 

TTCCCCTCTCGCCAAAGGAATGCAAGGTCTGTTGAATGTCGTGAAGGAAGCAGTTCCTCT 

GGAAGCTTCTTGAAGACAAACAACGTCTGTAGCGACCCTTTGCAGGCAGCGGAACCCCCC 

ACCTGGCGACAGGTGCCTCTGCGGCCAAAAGCCACGTGTATAAGATACACCTGCAAAGGC 

GGCACAACCCCAGTGCCACGTTGTGAGTTGGATAGTTGTGGAAAGAGTCAAATGGCTCTC 

CTCAAGCGTAGTCAACAAGGGGCTGAAGGATGCCCAGAAGGTACCCCATTGTATGGGAAT 

CTGATCTGGGGCCTCGGTGCACATGCTTTACATGTGTTTAGTCGAGGTTAAAAAAGCTCT 

AGGCCCCCCGAACCACGGGGACGTGGTTTTCCTTTGAAAAACACGATGATAAGCTTGCCA

CAACCCCGTACCAAAGATGGATAGATCCGGAAAGCCTGAACTCACCGCGACGTCTGTCGA 

GAAGTTTCTGATCGAAAAGTTCGACAGCGTCTCCGACCTGATGCAGCTCTCGGAGGGCGA 

AGAATCTCGTGCTTTCAGCTTCGATGTAGGAGGGCGTGGATATGTCCTGCGGGTAAATAG 

CTGCGCCGATGGTTTCTACAAAGATCGTTATGTTTATCGGCACTTTGCATCGGCCGCGCT 

CCCGATTCCGGAAGTGCTTGACATTGGGGAATTCAGCGAGAGCCTGACCTATTGCATCTC 

CCGCCGTGCACAGGGTGTCACGTTGCAAGACCTGCCTGAAACCGAACTGCCCGCTGTTCT 

GCAGCCGGTCGCGGAGGCCATGGATGCGATCGCTGCGGCCGATCTTAGCCAGACGAGCGG 

GTTCGGCCCATTCGGACCGCAAGGAATCGGTCAATACACTACATGGCGTGATTTCATATG 

CGCGATTGCTGATCCCCATGTGTATCACTGGCAAACTGTGATGGACGACACCGTCAGTGC 

GTCCGTCGCGCAGGCTCTCGATGAGCTGATGCTTTGGGCCGAGGACTGCCCCGAAGTCCG 

GCACCTCGTGCACGCGGATTTCGGCTCCAACAATGTCCTGACGGACAATGGCCGCATAAC 

AGCGGTCATTGACTGGAGCGAGGCGATGTTCGGGGATTCCCAATACGAGGTCGCCAACAT 

CTTCTTCTGGAGGCCGTGGTTGGCTTGTATGGAGCAGCAGACGCGCTACTTCGAGCGGAG 

GCATCCGGAGCTTGCAGGATCGCCGCGGCTCCGGGCGTATATGCTCCGCATTGGTCTTGA 

CCAACTCTATCAGAGCTTGGTTGACGGCAATTTCGATGATGCAGCTTGGGCGCAGGGTCG 

ATGCGACGCAATCGTCCGATCCGGAGCCGGGACTGTCGGGCGTACACAAATCGCCCGCAG 

AAGCGCGGCCGTCTGGACCGATGGCTGTGTAGAAGTACTCGCCGATAGTGGAAACCGACG 

CCCCAGCACTCGTCCGAGGGCAAAGGAATAGAGTAGATGCCGACCGAACAAGAGCTGATT 

TCGAGAACGCCTCAGCCAGCAACTCGCGCGAGCCTAGCAAGGCAAATGCGAGAGAACGGC 

CTTACGCTTGGTGGCACAGTTCTCGTCCACAGTTCGCTAAGCTCGCTCGGCTGGGTCGCG 

GGAGGGCCGGTCGCAGTGATTCAGGCCCTTCTGGATTGTGTTGGTCCCCAGGGCACGATT 

GTCATGCCCACGCACTCGGGTGATCTGACTGATCCCGCAGATTGGAGATCGCCGCCCGTG 

CCTGCCGATTGGGTGCAGATCTAGAGCTCGCTGATCAGCCTCGACTGTGCCTCTAGTTGC 

CAGCCATCTGTTGTTTGCCCCTCCCCCGTGCCTTCCTTGACCCTGGAAGGTGCCACTCCC 

ACTGTCCTTTCCTAATAAAATGAGGAAATTGCATCGCATTGTCTGAGTAGGTGTCATTCT 

ATTCTGGGGGGTGGGGTGGGGCAGGACAGCAAGGGGGAGGATTGGGAAGACAATAGCAGG 

CATGCTGGGGATGCGGTGGGCTCTATGGCTTCTGAGGCGGAAAGAACCAGCTGGGGCTCG 
AGTGCATTCTAGTTGTGGTTTGTCCAAACTCATCAATGTATCTTATCATGTCTGTATACC
GTCGACCTCTAGCTAGAGCTTGGCGTAATCATGGTCATAGCTGTTTCCTGTGTGAAATTG 
TTATCCGCTCACAATTCCACACAACATACGAGCCGGAAGCATAAAGTGTAAAGCCTGGGG 
TGCCTAATGAGTGAGCTAACTCACATTAATTGCGTTGCGCTCACTGCCCGCTTTCCAGTC 
GGGAAACCTGTCGTGCCAGCTGCATTAATGAATCGGCCAACGCGCGGGGAGAGGCGGTTT 
GCGTATTGGGCGCTCTTCCGCTTCCTCGCTCACTGACTCGCTGCGCTCGGTCGTTCGGCT 
GCGGCGAGCGGTATCAGCTCACTCAAAGGCGGTAATACGGTTATCCACAGAATCAGGGGA 
TAACGCAGGAAAGAACATGTGAGCAAAAGGCCAGCAAAAGGCCAGGAACCGTAAAAAGGC
CGCGTTGCTGGCGTTTTTCCATAGGCTCCGCCCCCCTGACGAGCATCACAAAAATCGACG 
CTCAAGTCAGAGGTGGCGAAACCCGACAGGACTATAAAGATACCAGGCGTTTCCCCCTGG 
AAGCTCCCTCGTGCGCTCTCCTGTTCCGACCCTGCCGCTTACCGGATACCTGTCCGCCTT 
TCTCCCTTCGGGAAGCGTGGCGCTTTCTCAATGCTCACGCTGTAGGTATCTCAGTTCGGT 
GTAGGTCGTTCGCTCCAAGCTGGGCTGTGTGCACGAACCCCCCGTTCAGCCCGACCGCTG 
CGCCTTATCCGGTAACTATCGTCTTGAGTCCAACCCGGTAAGACACGACTTATCGCCACT 
GGCAGCAGCCACTGGTAACAGGATTAGCAGAGCGAGGTATGTAGGCGGTGCTACAGAGTT 
CTTGAAGTGGTGGCCTAACTACGGCTACACTAGAAGGACAGTATTTGGTATCTGCGCTCT 
GCTGAAGCCAGTTACCTTCGGAAAAAGAGTTGGTAGCTCTTGATCCGGCAAACAAACCAC
CGCTGGTAGCGGTGGTTTTTTTGTTTGCAAGCAGCAGATTACGCGCAGAAAAAAAGGATC 
TCAAGAAGATCCTTTGATCTTTTCTACGGGGTCTGACGCTCAGTGGAACGAAAACTCACG 
TTAAGGGATTTTGGTCATGAGATTATCAAAAAGGATCTTCACCTAGATCCTTTTAAATTA 
AAAATGAAGTTTTAAATCAATCTAAAGTATATATGAGTAAACTTGGTCTGACAGTTACCA 
ATGCTTAATCAGTGAGGCACCTATCTCAGCGATCTGTCTATTTCGTTCATCCATAGTTGC 
CTGACTCCCCGTCGTGTAGATAACTACGATACGGGAGGGCTTACCATCTGGCCCCAGTGC 
TGCAATGATACCGCGAGACCCACGCTCACCGGCTCCAGATTTATCAGCAATAAACCAGCC 
AGCCGGAAGGGCCGAGCGCAGAAGTGGTCCTGCAACTTTATCCGCCTCCATCCAGTCTAT 
TAATTGTTGCCGGGAAGCTAGAGTAAGTAGTTCGCCAGTTAATAGTTTGCGCAACGTTGT 
TGCCATTGCTACAGGCATCGTGGTGTCACGCTCGTCGTTTGGTATGGCTTCATTCAGCTC 
CGGTTCCCAACGATCAAGGCGAGTTACATGATCCCCCATGTTGTGCAAAAAAGCGGTTAG
CTCCTTCGGTCCTCCGATCGTTGTCAGAAGTAAGTTGGCCGCAGTGTTATCACTCATGGT 
TATGGCAGCACTGCATAATTCTCTTACTGTCATGCCATCCGTAAGATGCTTTTCTGTGAC 
TGGTGAGTACTCAACCAAGTCATTCTGAGAATAGTGTATGCGGCGACCGAGTTGCTCTTG 
CCCGGCGTCAATACGGGATAATACCGCGCCACATAGCAGAACTTTAAAAGTGCTCATCAT 
TGGAAAACGTTCTTCGGGGCGAAAACTCTCAAGGATCTTACCGCTGTTGAGATCCAGTTC 
GATGTAACCCACTCGTGCACCCAACTGATCTTCAGCATCTTTTACTTTCACCAGCGTTTC 
TGGGTGAGCAAAAACAGGAAGGCAAAATGCCGCAAAAAAGGGAATAAGGGCGACACGGAA 
ATGTTGAATACTCATACTCTTCCTTTTTCAATATTATTGAAGCATTTATCAGGGTTATTG 
TCTCATGAGCGGATACATATTTGAATGTATTTAGAAAAATAAACAAATAGGGGTTCCGGG 
CACATTTCCCCGAAAAGTGCCACCTGACGTCGACGGATCGGGAGATCTCCCGATCCCCTA 
TGGTCGACTCTCAGTACAATCTGCTCTGATGCCGCATAGTTAAGCCAGTATCTGCTCCCT 
GCTTGTGTGTTGGAGGTCGCTGAGTAGTGCGCGAGCAAAATTTAAGCTACAACAAGGCAA 
GGCTTGACCGACAATTGCATGAAGAATCTGCTTAGGGTTAGGCGTTTTGCGCTGCTTCGC 
GATGTACGGGCCAGATATACGCGTTGACATTGATTATTGACTAGTTATTAATAGTAATCA 
ATTACGGGGTCATTAGTTCATAGCCCATATATGGAGTTCCGCGTTACATAACTTACGGTA 
AATGGCCCGCCTGGCTGACCGCCCAACGACCCCCGCCCATTGACGTCAATAATGACGTAT 
GTTCCCATAGTAACGCCAATAGGGACTTTCCATTGACGTCAATGGGTGGACTATTTACGG 
TAAACTGCCCACTTGGCAGTACATCAAGTGTATCATATGCCAAGTACGCCCCCTATTGAC 
GTCAATGACGGTAAATGGCCCGCCTGGCATTATGCCCAGTACATGACCTTATGGGACTTT 
CCTACTTGGCAGTACATCTACGTATTAGTCATCGCTATTACCATGGTGATGCGGTTTTGG 
CAGTACATCAATGGGCGTGGATAGCGGTTTGACTCACGGGGATTTCCAAGTCTCCACCCC 
ATTGACGTCAATGGGAGTTTGTTTTGGCACCAAAATCAACGGGACTTTCCAAAATGTCGT 
AACAACTCCGCCCCATTGACGCAAATGGGCGGTAGGCGTGTACGGTGGGAGGTCTATATA 
AGCAGAGCTCTCTGGCTAACTAGAGAACCCACTGCTTACTGGCTTATCGAAATTAATACG 
ACTCACTATAGGGAGACCCAAGCTTGGTACCGAGCTCGGATCCACTAGTAACGGCCGCCA 
GTGTGCTGGAATTAATTCGCTGTCTGCGAGGGCCAGCTGTTGGGGTGAGTACTCCCTCTC 
AAAAGCGGGCATGACTTCTGCGCTAAGATTGTCAGTTTCCAAAAACGAGGAGGATTTGAT 
ATTCACCTGGCCCGCGGTGATGCCTTTGAGGGTGGCCGCGTCCATCTGGTCAGAAAAGAC 
AATCTTTTTGTTGTCAAGCTTGAGGTGTGGCAGGCTTGAGATCTGGCCATACACTTGAGT 
GACAATGACATCCACTTTGCCTTTCTCTCCACAGGTGTCCACTCCCAGGTCCAACTGCAG 
GTCGATCGAGCA 
SEQ ID No. 18 - pESDSYNGP



TCAATATTGGCCATTAGCCATATTATTCATTGGTTATATAGCATAAATCAATATTGGCTA 

TTGGCCATTGCATACGTTGTATCTATATCATAATATGTACATTTATATTGGCTCATGTCC 

AATATGACCGCCATGTTGGCATTGATTATTGACTAGTTATTAATAGTAATCAATTACGGG 

GTCATTAGTTCATAGCCCATATATGGAGTTCCGCGTTACATAACTTACGGTAAATGGCCC 

GCCTGGCTGACCGCCCAACGACCCCCGCCCATTGACGTCAATAATGACGTATGTTCCCAT 

AGTAACGCCAATAGGGACTTTCCATTGACGTCAATGGGTGGAGTATTTACGGTAAACTGC 

CCACTTGGCAGTACATCAAGTGTATCATATGCCAAGTCCGCCCCCTATTGACGTCAATGA 

CGGTAAATGGCCCGCCTGGCATTATGCCCAGTACATGACCTTACGGGACTTTCCTACTTG 

GCAGTACATCTACGTATTAGTCATCGCTATTACCATGGTGATGCGGTTTTGGCAGTACAC 

CAATGGGCGTGGATAGCGGTTTGACTCACGGGGATTTCCAAGTCTCCACCCCATTGACGT 

CAATGGGAGTTTGTTTTGGCACCAAAATCAACGGGACTTTCCAAAATGTCGTAACAACTG 

CGATCGCCCGCCCCGTTGACGCAAATGGGCGGTAGGCGTGTACGGTGGGAGGTCTATATA 

AGCAGAGCTCGTTTAGTGAACCGTCAGATCACTAGAAGCTTTATTGCGGTAGTTTATCAC

AGTTAAATTGCTAACGCAGTCAGTGCTTCTGACACAACAGTCTCGAACTTAAGCTGCAGT 

GACTCTCTTAAGGTAGCCTTGCAGAAGTTGGTCGTGAGGCACTGGGCAGGTAAGTATCAA 

GGTTACAAGACAGGTTTAAGGAGACCAATAGAAACTGGGCTTGTCGAGACAGAGAAGACT 

CTTGCGTTTCTGATAGGCACCTATTGGTCTTACTGACATCCACTTTGCCTTTCTCTCCAC 

AGGTGTCCACTCCCAGTTCAATTACAGCTCTTAAGGCTAGAGTACTTAATACGACTCACT 

ATAGGCTAGAGAATTCCAGGTAAGATGGGCGATCCCCTCACCTGGTCCAAAGCCCTGAAG 

AAACTGGAAAAAGTCACCGTTCAGGGTAGCCAAAAGCTTACCACAGGCAATTGCAACTGG 

GCATTGTCCCTGGTGGATCTTTTCCACGACACTAATTTCGTTAAGGAGAAAGATTGGCAA 

CTCAGAGACGTGATCCCCCTCTTGGAGGACGTGACCCAAACATTGTCTGGGCAGGAGCGC 

GAAGCTTTCGAGCGCACCTGGTGGGCCATCAGCGCAGTCAAAATGGGGCTGCAAATCAAC 

AACGTGGTTGACGGTAAAGCTAGCTTTCAACTGCTCCGCGCTAAGTACGAGAAGAAAACC 

GCCAACAAGAAACAATCCGAACCTAGCGAGGAGTACCCAATTATGATCGACGGCGCCGGC 

AATAGGAACTTCCGCCCACTGACTCCCAGGGGCTATACCACCTGGGTCAACACCATCCAG

ACAAACGGACTTTTGAACGAAGCCTCCCAGAACCTGTTCGGCATCCTGTCTGTGGACTGC 

ACCTCCGAAGAAATGAATGCTTTTCTCGACGTGGTGCCAGGACAGGCTGGACAGAAACAG 

ATCCTGCTCGATGCCATTGACAAGATCGCCGACGACTGGGATAATCGCCACCCCCTGCCA 

AACGCCCCTCTGGTGGCTCCCCCACAGGGGCCTATCCCTATGACCGCTAGGTTCATTAGG 

GGACTGGGGGTGCCCCGCGAACGCCAGATGGAGCCAGCATTTGACCAATTTAGGCAGACC 

TACAGACAGTGGATCATCGAAGCCATGAGCGAGGGGATTAAAGTCATGATCGGAAAGCCC 

AAGGCACAGAACATCAGGCAGGGGGCCAAGGAACCATACCCTGAGTTTGTCGACAGGCTT 

CTGTCCCAGATTAAATCCGAAGGCCACCCTCAGGAGATCTCCAAGTTCTTGACAGACACA 

CTGACTATCCAAAATGCAAATGAAGAGTGCAGAAACGCCATGAGGCACCTCAGACCTGAA 

GATACCCTGGAGGAGAAAATGTACGCATGTCGCGACATTGGCACTACCAAGCAAAAGATG 

ATGCTGCTCGCCAAGGCTCTGCAAACCGGCCTGGCTGGTCCATTCAAAGGAGGAGCACTG 

AAGGGAGGTCCATTGAAAGCTGCACAAACATGTTATAATTGTGGGAAGCCAGGACATTTA 

TCTAGTCAATGTAGAGCACCTAAAGTCTGTTTTAAATGTAAACAGCCTGGACATTTCTCA 

AAGCAATGCAGAAGTGTTCCAAAAAACGGGAAGCAAGGGGCTCAAGGGAGGCCCCAGAAA 

CAAACTTTCCCGATACAACAGAAGAGTCAGCACAACAAATCTGTTGTACAAGAGACTCCT 

CAGACTCAAAATCTGTACCCAGATCTGAGCGAAATAAAAAAGGAATACAATGTCAAGGAG 

AAGGATCAAGTAGAGGATCTCAACCTGGACAGTTTGTGGGAGTAACATACAATCTCGAGA 

AGAGGCCCACTACCATCGTCCTGATCAATGACACCCCTCTTAATGTGCTGCTGGACACCG 

GAGCCGACACCAGCGTTCTCACTACTGCTCACTATAACAGACTGAAATACAGAGGAAGGA 

AATACCAGGGCACAGGCATCATCGGCGTTGGAGGCAACGTCGAAACCTTTTCCACTCCTG 

TCACCATCAAAAAGAAGGGGAGACACATTAAAACCAGAATGCTGGTCGCCGACATCCCCG 

TCACCATCCTTGGCAGAGACATTCTCCAGGACCTGGGCGCTAAACTCGTGCTGGCACAAC 

TGTCTAAGGAAATCAAGTTCCGCAAGATCGAGCTGAAAGAGGGCACAATGGGTCCAAAAA 

TCCCCCAGTGGCCCCTGACCAAAGAGAAGCTTGAGGGCGCTAAGGAAATCGTGCAGCGCC 

TGCTTTCTGAGGGCAAGATTAGCGAGGCCAGCGACAATAACCCTTACAACAGCCCCATCT 

TTGTGATTAAGAAAAGGAGCGGCAAATGGAGACTCCTGCAGGACCTGAGGGAACTCAACA 

AGACCGTCCAGGTCGGAACTGAGATCTCTCGCGGACTGCCTCACCCCGGCGGCCTGATTA 

AATGCAAGCACATGACAGTCCTTGACATTGGAGACGCTTATTTTACCATCCCCCTCGATC 
CTGAATTTCGCCCCTATACTGCTTTTACCATCCCCAGCATCAATCACCAGGAGCCCGATA
AACGCTATGTGTGGAAGTGCCTCCCCCAGGGATTTGTGCTTAGCCCCTACATTTACCAGA
AGACACTTCAAGAGATCCTCCAACCTTTCCGCGAAAGATACCCAGAGGTTCAACTCTACC
AATATATGGACGACCTGTTCATGGGGTCCAACGGGTCTAAGAAGCAGCACAAGGAACTCA 
TCATCGAACTGAGGGCAATCCTCCTGGAGAAAGGCTTCGAGACACCCGACGACAAGCTGC 
AAGAAGTTCCTCCATATAGCTGGCTGGGCTACCAGCTTTGCCCTGAAAACTGGAAAGTCC 
AGAAGATGCAGTTGGATATGGTCAAGAACCCAACACTGAACGACGTCCAGAAGCTCATGG 
GCAATATTACCTGGATGAGCTCCGGAATCCCTGGGCTTACCGTTAAGCACATTGCCGCAA 
CTACAAAAGGATGCCTGGAGTTGAACCAGAAGGTCATTTGGACAGAGGAAGCTCAGAAGG
AACTGGAGGAGAATAATGAAAAGATTAAGAATGCTCAAGGGCTCCAATACTACAATCCCG 
AAGAAGAAATGTTGTGCGAGGTCGAAATCACTAAGAACTACGAAGCCACCTATGTCATCA 
AACAGTCCCAAGGCATCTTGTGGGCCGGAAAGAAAATCATGAAGGCCAACAAAGGCTGGT 
CCACCGTTAAAAATCTGATGCTCCTGCTCCAGCACGTCGCCACCGAGTCTATCACCCGCG 
TCGGCAAGTGCCCCACCTTCAAAGTTCCCTTCACTAAGGAGCAGGTGATGTGGGAGATGC 
AAAAAGGCTGGTACTACTCTTGGCTTCCCGAGATCGTCTACACCCACCAAGTGGTGCACG 
ACGACTGGAGAATGAAGCTTGTCGAGGAGCCCACTAGCGGAATTACAATCTATACCGACG 
GCGGAAAGCAAAACGGAGAGGGAATCGCTGCATACGTCACATCTAACGGCCGCACCAAGC 
AAAAGAGGCTCGGCCCTGTCACTCACCAGGTGGCTGAGAGGATGGCTATCCAGATGGCCC 
TTGAGGACACTAGAGACAAGCAGGTGAACATTGTGACTGACAGCTACTACTGCTGGAAAA 
ACATCACAGAGGGCCTTGGCCTGGAGGGACCCCAGTCTCCCTGGTGGCCTATCATCCAGA 
ATATCCGCGAAAAGGAAATTGTCTATTTCGCCTGGGTGCCTGGACACAAAGGAATTTACG 
GCAACCAACTCGCCGATGAAGCCGCCAAAATTAAAGAGGAAATCATGCTTGCCTACCAGG 
GCACACAGATTAAGGAGAAGAGAGACGAGGACGCTGGCTTTGACCTGTGTGTGCCATACG 
ACATCATGATTCCCGTTAGCGACACAAAGATCATTCCAACCGATGTCAAGATCCAGGTGC 
CACCCAATTCATTTGGTTGGGTGACCGGAAAGTCCAGCATGGCTAAGCAGGGTCTTCTGA 
TTAACGGGGGAATCATTGATGAAGGATACACCGGCGAAATCCAGGTGATCTGCACAAATA 
TCGGCAAAAGCAATATTAAGCTTATCGAAGGGCAGAAGTTCGCTCAACTCATCATCCTCC 
AGCACCACAGCAATTCAAGACAACCTTGGGACGAAAACAAGATTAGCCAGAGAGGTGACA 
AGGGCTTCGGCAGCACAGGTGTGTTCTGGGTGGAGAACATCCAGGAAGCACAGGACGAGC 
ACGAGAATTGGCACACCTCCCCTAAGATTTTGGCCCGCAATTACAAGATCCCACTGACTG 
TGGCTAAGCAGATCACACAGGAATGCCCCCACTGCACCAAACAAGGTTCTGGCCCCGCCG 
GCTGCGTGATGAGGTCCCCCAATCACTGGCAGGCAGATTGCACCCACCTCGACAACAAAA 
TTATCCTGACCTTCGTGGAGAGCAATTCCGGCTACATCCACGCAACACTCCTCTCCAAGG 
AAAATGCATTGTGCACCTCCCTCGCAATTCTGGAATGGGCCAGGCTGTTCTCTCCAAAAT 
CCCTGCACACCGACAACGGCACCAACTTTGTGGCTGAACCTGTGGTGAATCTGCTGAAGT 
TCCTGAAAATCGCCCACACCACTGGCATTCCCTATCACCCTGAAAGCCAGGGCATTGTCG 
AGAGGGCCAACAGAACTCTGAAAGAAAAGATCCAATCTCACAGAGACAATACACAGACAT 
TGGAGGCCGCACTTCAGCTCGCCCTTATCACCTGCAACAAAGGAAGAGAAAGCATGGGCG 
GCCAGACCCCCTGGGAGGTCTTCATCACTAACCAGGCCCAGGTCATCCATGAAAAGCTGC 
TCTTGCAGCAGGCCCAGTCCTCCAAAAAGTTCTGCTTTTATAAGATCCCCGGTGAGCACG 
ACTGGAAAGGTCCTACAAGAGTTTTGTGGAAAGGAGACGGCGCAGTTGTGGTGAACGATG 
AGGGCAAGGGGATCATCGCTGTGCCCCTGACACGCACCAAGCTTCTCATCAAGCCAAACT 
GAACCCGGGGCGGCCGCTTCCCTTTAGTGAGGGTTAATGCTTCGAGCAGACATGATAAGA 
TACATTGATGAGTTTGGACAAACCACAACTAGAATGCAGTGAAAAAAATGCTTTATTTGT 
GAAATTTGTGATGCTATTGCTTTATTTGTAACCATTATAAGCTGCAATAAACAAGTTAAC 
AACAACAATTGCATTCATTTTATGTTTCAGGTTCAGGGGGAGATGTGGGAGGTTTTTTAA 
AGCAAGTAAAACCTCTACAAATGTGGTAAAATCCGATAAGGATCGATCCGGGCTGGCGTA 
ATAGCGAAGAGGCCCGCACCGATCGCCCTTCCCAACAGTTGCGCAGCCTGAATGGCGAAT 
GGACGCGCCCTGTAGCGGCGCATTAAGCGCGGCGGGTGTGGTGGTTACGCGCAGCGTGAC 
CGCTACACTTGCCAGCGCCCTAGCGCCCGCTCCTTTCGCTTTCTTCCCTTCCTTTCTCGC 
CACGTTCGCCGGCTTTCCCCGTCAAGCTCTAAATCGGGGGCTCCCTTTAGGGTTCCGATT 
TAGAGCTTTACGGCACCTCGACCGCAAAAAACTTGATTTGGGTGATGGTTCACGTAGTGG 
GCCATCGCCCTGATAGACGGTTTTTCGCCCTTTGACGTTGGAGTCCACGTTCTTTAATAG 
TGGACTCTTGTTCCAAACTGGAACAACACTCAACCCTATCTCGGTCTATTCTTTTGATTT 
ATAAGGGATTTTGCCGATTTCGGCCTATTGGTTAAAAAATGAGCTGATTTAACAAATATT 
TAACGCGAATTTTAACAAAATATTAACGTTTACAATTTCGCCTGATGCGGTATTTTCTCC 
TTACGCATCTGTGCGGTATTTCACACCGCATACGCGGATCTGCGCAGCACCATGGCCTGA 
AATAACCTCTGAAAGAGGAACTTGGTTAGGTACCTTCTGAGGCGGAAAGAACCAGCTGTG 
GAATGTGTGTCAGTTAGGGTGTGGAAAGTCCCCAGGCTCCCCAGCAGGCAGAAGTATGCA 
AAGCATGCATCTCAATTAGTCAGCAACCAGGTGTGGAAAGTCCCCAGGCTCCCCAGCAGG 
CAGAAGTATGCAAAGCATGCATCTCAATTAGTCAGCAACCATAGTCCCGCCCCTAACTCC 
GCCCATCCCGCCCCTAACTCCGCCCAGTTCCGCCCATTCTCCGCCCCATGGCTGACTAAT 
TTTTTTTATTTATGCAGAGGCCGAGGCCGCCTCGGCCTCTGAGCTATTCCAGAAGTAGTG 
AGGAGGCTTTTTTGGAGGCCTAGGCTTTTGCAAAAAGCTTGATTCTTCTGACACAACAGT 
CTCGAACTTAAGGCTAGAGCCACCATGATTGAACAAGATGGATTGCACGCAGGTTCTCCG 
GCCGCTTGGGTGGAGAGGCTATTCGGCTATGACTGGGCACAACAGACAATCGGCTGCTCT 
GATGCCGCCGTGTTCCGGCTGTCAGCGCAGGGGCGCCCGGTTCTTTTTGTCAAGACCGAC 
CTGTCCGGTGCCCTGAATGAACTGCAGGACGAGGCAGCGCGGCTATCGTGGCTGGCCACG 
ACGGGCGTTCCTTGCGCAGCTGTGCTCGACGTTGTCACTGAAGCGGGAAGGGACTGGCTG 
CTATTGGGCGAAGTGCCGGGGCAGGATCTCCTGTCATCTCACCTTGCTCCTGCCGAGAAA 
GTATCCATCATGGCTGATGCAATGCGGCGGCTGCATACGCTTGATCCGGCTACCTGCCCA 
TTCGACCACCAAGCGAAACATCGCATCGAGCGAGCACGTACTCGGATGGAAGCCGGTCTT 
GTCGATCAGGATGATCTGGACGAAGAGCATCAGGGGCTCGCGCCAGCCGAACTGTTCGCC 
AGGCTCAAGGCGCGCATGCCCGACGGCGAGGATCTCGTCGTGACCCATGGCGATGCCTGC 
TTGCCGAATATCATGGTGGAAAATGGCCGCTTTTCTGGATTCATCGACTGTGGCCGGCTG 
GGTGTGGCGGACCGCTATCAGGACATAGCGTTGGCTACCCGTGATATTGCTGAAGAGCTT 
GGCGGCGAATGGGCTGACCGCTTCCTCGTGCTTTACGGTATCGCCGCTCCCGATTCGCAG 
CGCATCGCCTTCTATCGCCTTCTTGACGAGTTCTTCTGAGCGGGACTCTGGGGTTCGAAA 
TGACCGACCAAGCGACGCCCAACCTGCCATCACGATGGCCGCAATAAAATATCTTTATTT 
TCATTACATCTGTGTGTTGGTTTTTTGTGTGAATCGATAGCGATAAGGATCCGCGTATGG 
TGCACTCTCAGTACAATCTGCTCTGATGCCGCATAGTTAAGCCAGCCCCGACACCCGCCA 
ACACCCGCTGACGCGCCCTGACGGGCTTGTCTGCTCCCGGCATCCGCTTACAGACAAGCT 
GTGACCGTCTCCGGGAGCTGCATGTGTCAGAGGTTTTCACCGTCATCACCGAAACGCGCG 
AGACGAAAGGGCCTCGTGATACGCCTATTTTTATAGGTTAATGTCATGATAATAATGGTT 
TCTTAGACGTCAGGTGGCACTTTTCGGGGAAATGTGCGCGGAACCCCTATTTGTTTATTT 
TTCTAAATACATTCAAATATGTATCCGCTCATGAGACAATAACCCTGATAAATGCTTCAA 
TAATATTGAAAAAGGAAGAGTATGAGTATTCAACATTTCCGTGTCGCCCTTATTCCCTTT 
TTTGCGGCATTTTGCCTTCCTGTTTTTGCTCACCCAGAAACGCTGGTGAAAGTAAAAGAT 
GCTGAAGATCAGTTGGGTGCACGAGTGGGTTACATCGAACTGGATCTCAACAGCGGTAAG 
ATCCTTGAGAGTTTTCGCCCCGAAGAACGTTTTCCAATGATGAGCACTTTTAAAGTTCTG 
CTATGTGGCGCGGTATTATCCCGTATTGACGCCGGGCAAGAGCAACTCGGTCGCCGCATA 
CACTATTCTCAGAATGACTTGGTTGAGTACTCACCAGTCACAGAAAAGCATCTTACGGAT 
GGCATGACAGTAAGAGAATTATGCAGTGCTGCCATAACCATGAGTGATAACACTGCGGCC 
AACTTACTTCTGACAACGATCGGAGGACCGAAGGAGCTAACCGCTTTTTTGCACAACATG 
GGGGATCATGTAACTCGCCTTGATCGTTGGGAACCGGAGCTGAATGAAGCCATACCAAAC 
GACGAGCGTGACACCACGATGCCTGTAGCAATGGCAACAACGTTGCGCAAACTATTAACT 
GGCGAACTACTTACTCTAGCTTCCCGGCAACAATTAATAGACTGGATGGAGGCGGATAAA 
GTTGCAGGACCACTTCTGCGCTCGGCCCTTCCGGCTGGCTGGTTTATTGCTGATAAATCT 
GGAGCCGGTGAGCGTGGGTCTCGCGGTATCATTGCAGCACTGGGGCCAGATGGTAAGCCC 
TCCCGTATCGTAGTTATCTACACGACGGGGAGTCAGGCAACTATGGATGAACGAAATAGA 
CAGATCGCTGAGATAGGTGCCTCACTGATTAAGCATTGGTAACTGTCAGACCAAGTTTAC 
TCATATATACTTTAGATTGATTTAAAACTTCATTTTTAATTTAAAAGGATCTAGGTGAAG 
ATCCTTTTTGATAATCTCATGACCAAAATCCCTTAACGTGAGTTTTCGTTCCACTGAGCG 
TCAGACCCCGTAGAAAAGATCAAAGGATCTTCTTGAGATCCTTTTTTTCTGCGCGTAATC 
TGCTGCTTGCAAACAAAAAAACCACCGCTACCAGCGGTGGTTTGTTTGCCGGATCAAGAG 
CTACCAACTCTTTTTCCGAAGGTAACTGGCTTCAGCAGAGCGCAGATACCAAATACTGTC 
CTTCTAGTGTAGCCGTAGTTAGGCCACCACTTCAAGAACTCTGTAGCACCGCCTACATAC 
CTCGCTCTGCTAATCCTGTTACCAGTGGCTGCTGCCAGTGGCGATAAGTCGTGTCTTACC 
GGGTTGGACTCAAGACGATAGTTACCGGATAAGGCGCAGCGGTCGGGCTGAACGGGGGGT 
TCGTGCACACAGCCCAGCTTGGAGCGAACGACCTACACCGAACTGAGATACCTACAGCGT 
GAGCTATGAGAAAGCGCCACGCTTCCCGAAGGGAGAAAGGCGGACAGGTATCCGGTAAGC 
GGCAGGGTCGGAACAGGAGAGCGCACGAGGGAGCTTCCAGGGGGAAACGCCTGGTATCTT 
TATAGTCCTGTCGGGTTTCGCCACCTCTGACTTGAGCGTCGATTTTTGTGATGCTCGTCA 
GGGGGGCGGAGCCTATGGAAAAACGCCAGCAACGCGGCCTTTTTACGGTTCCTGGCCTTT 
TGCTGGCCTTTTGCTCACATGGCTCGACAGATCT 
pONY8.3G FB29 - (SEQ ID NO:19) 

AGATXnTGAATAATAAAATCnXJTC3TTTGT0CGAAATACGCQT^^ 

GACTAAATTCATGTCGCGCGATAGTGGTGTTTATCGCCGATAGAGATGGCGATATTGGAA 

AAATTGATAmGAAAATATGGCATATTGAAAATGTCGCCX3ATGTGAGTIT(^^ 

TGATATCGCCATTriTCCAAAAGTQATITITGGGCATAC^ 

TATATCGmACGGGGGATGGCGATAGACGACTTTGOTGACTTGGGCGAT^^ 

GCAAATATCGCAGTTTCGATATAGGTGACAGACGATATGAGGCTATATCGCCGATAGAGG 

CGACATCAAGCTGGCACATGGCCAATGCATATCGATCTATACATTGAATCAATATTGGCC 

ATTAGCCATATTATTCATTGGTTATATAGCATAAATCAATATTQGCTAT^^ 

TACGTTGTATCCATATCGTAATATGTACATTTATATTGGCTCATGTCCAAC^ 

ATGTTGACATTGATTATTGACTAGTTATTAATAGTAATCAATTACGGGGTCAT^^ 

TAOCCCATATATGGAGTTCCGCGTTACATAACnACGGTAAATGGCCCGOT 

GCCCAACOACCCCCGCXX^AmACGTCAATAATOACGTATGTTCC^^ 

AGGGACmCCATTGACGTCAATGGGTGGAGTAmACGGTAAACTGCCCAOTGGCAGT 

ACATCAAGTGTATCATATGCrAAGTCOKJCaXrrATTGA 

CGCCTOGCATTATGCCXJAOTACATGACaTACGGGACmC^^ 

CGTATTAGTCATCGCTATrACCATGGTOATGCOGTTTTGQCAGTACAC^ 

ATAGCGGTTTGACnCACGGGGATTTCCAAGTCTCCACCCCATTGACGTC 

GTTTTGGCACCAAAATCAACGGGACrrTCCAAAATGTCGTAACA^ 

CanrOACGCAAATGGGCGGTAQGCOTGTACCKJTGOGAGGTCT 

nAGTGAArcGGGCACTCAGATTCTGCGGTCTGAGTCCCnxn-CTGC^ 

CCTTTGTAATAAATATAATTCTCTACTCAGTCCCTGTCT^ 

CTACAGTTGGCGCCCGAACAGGOACCTGAGAGOGGCGCAGACCCTACCTGTTGA^^ 

CTGATCGTAGGATCCCCGGGACAGCAGAGGAGAACTTACAGAAGTXTT^ 

CTGGCCAGAACACAGGAGGACAGGTAAGATTGGGAGACXCnTrGACATTGGAGCAAGGCG 

CTCAAGAAGTTAGAGAAGGTGACGGTACAAGGGTCTCAGAAATTAACTACTGGTAACTGT 

AATTGGGCGCTAAGTCTAGTAaACTTATTTCATGATACCAAC^ 

TGGCAGCTGAGGGATGTCATT<XATTGCrGGAAGATGTAACTCAGACGCTQTCAGGA(^ 

GAAAGAGAGGCCTTTGAAAGAACATGGTGGGCAATTTCTGCTGTAAAGATGG^^ 

ATrAATAATOTAGTAGATGOAAAGGCATCATTCCAGCTCCTAAGAGCGAAATATGAAA^ 

AAGACTGCTAATAAAAAGCAQTCTGAGCCCTCTGAAGAATATCTCTAGAAC^^ 

CCCCGGGCTGCAGGAGTGGGGAGGCACGATGGCCGCTTTGGTCGAGGCGGATCCGGCCAT 

TAGCCATATTATTCATTGGTTATATAGCATAAATCAATATTGGCTATTGGCCATO 

CGTTOTATCCATATCATAATATOTACATTTATATTGOCTCATOTCCA^^ 

GTTGACATTGATrATTGACTAGTTATTAATAGTAATCAATTACGGGGTCAmOTTCATA 

GCCCATATATGGAGTTCCGCGTTACATAACITACGGTAAATGGCCCGCCTGGCTGAC^^ 

CCAACGACCax:GCCCATTGACGTCAATAATGACGTATGTTCCCATAGTAACGCC^ 

GGACTTTCCATTGACQTCAATGGGTGGAGTATTTACXKn'AAACTGC^^ 

ATCAAGTGTATCATATGCCAAGTACGCCCCCTATTGACGTCAATGACGGTAAATGGCCCG 

CCTGGCATTATGCCCAGTACATGACCTTATGGGACmCXrrACTrGGCAGT^^ 

TATTAOTCATCGCTATTACCATGGTOATGCGGTTITG^ 

AGCGGmGACTCACGGGGATTTCCAAGTCTCCACCCCATTGACGTC^^ 

mGGCACCAAAATCAACGGGACirrCCAAAATGTCGTAACAACTCCGCCCCA 

AAATGGGCGGTAGGCATGTACGGTGGGAGGTCTATATAAGCAGAGCTCGTTTAGTGAACC 

GrcAGATCGCXn-GGAGACGCCATCCACGCTGrmGACXn-CCAT^^ 
GATCCAGCCTCX:CK:GGCCCCAAGCTrGTTGGGATCCACCGGTCGCCACCATGGTGA 

gggcgaggagctgttcacxwgggtggtgcccatcctggtcgagcto 

cggccacaagttcagcgtgtccggcgagggcgagggcgatgccacctacggcaagctgac 

cctgaagttcatctgcaccaccggcaagctgcccgtgccctggcccaccctcgtgaccac 

cctgacctacogcgtgcagtgcttcagcmn'accccgaccacatgaagcagcac^^ 

cncaagtcok^catgcccgaagqctacgtccaggagcgcaccatot 

cggcaactacaagacccxk:gccgaggtgaagttcgagggcgacaccctggtgaaccgcat 

CGAGCTGAAGGGCATCQACTTCAAGGAGGACGGCAACATCCTOGGGCACAAGCT^^ 

CAACTACAACAGCCACAAajTCTATATCATGOCCOACA^ 

GAACTTCAAGATtXGCCACAACATCGAGOACXKKIAGCGTGCAGC^ 

GCAGAACACCCCCATCGGCGACGGCCCCGTGCTGCTGCCCGACAACCACTAC^ 

CCAGTCCGCCCTGAGCAAAGACCCCAACGAGAAGCGCGATCACATGGTCCTGCTGGAGTT 

CGTGACXXKXX5C(XKK3ATCACTCTCGGCATGGACGAGCT^ 

CrCTAGAGTCGACCrrGCAGGCATGCAAGCTTCAGCrrGCTCGAGGGGGGGa:C^ 

GCXmGTTCCCmAGTGAGGGTTAATTGCGCGGGAAGTAmAT^^ 

AAGTAATACATGAGAAACTmACTACAGCAAGCACAATCCTCCAAAAi^ 

ACAAAATCCCTGQTGAACATOATTGGAAGGGACXn'ACT^ 

GTGCAGTAGTAGTTAATGATGAAGGAAAGGGAATAATTGCTGTACCATTAACCAGGACT^ 

AGmCTAATAAAACCAAATTGAGTATTGTTGCAGGAAGCAAGACCCAACT^ 

AGCTGTOTTTCCTCACCnrAATATTTGTTATAAGGm 

TCAACCCCTATTACCCAACAGTCAGAAAAATCTAAGTGTGAGGAGAACACAAT0 

CCTTATTGTTATAATAATGACAGTAAGAACAGCATGGCAGAATCGAAGGAAGCAAGAGAC 

CAAGAATGAACCTOAAAGAAGAATCTAAAGAAGAAAAAAGAAGAAATGACTGGTGGAA^ 

TAGGTATGTTTCTGTTATOCTTAGCAGGAACTACTGGAGOAATAC^ 

GACTCCCACAGCAACATTATATAGGGTTGGTGGCGATAGGGGGAAGATTAAACGGATCT^ 

GCCAATCAAATGCTATAGAATGCTGGGGTTCCTT(XCXK3GGTGTAGAa:Am 

ACTTCAGTTATOAGACCAATAGAAGCATOCATATOOATAATAATACTGCTACATTA^^ 

AAGCriTAACCAATATAACTGCTCTATAAATAACAAAACAOAATTAGAAA^ 

AGTAAAGACnrCTGGCATAACTCCmACCTATTTCTTCT^ 

TAGACATAAGAGAGATmGGTATAAGTGCAATAGTOGCAGCTATTGTAGCCGCTACTGC 

TArrGCTGCTAGCGCTACTATGTCnTATGTTGCTCTAACTO 

AGTACAAAATCATACTTTTGAGGTAGAAAATAGTACTCTAAATGGTATGGAm 

ACGACAAATAAAGATATTATATGCTATGATTOTCAAACACATGCAGATGTTCAACTGT^ 

AAAQGAAAGACAACAGGTAOAOOAGACX\mAAmAATTGGATGTATAGAAAGAACACA 

TGTATTTTGTCATACTGGTCATCCCTGGAATATGTCATGGGOACATTTA^ 

ACAATGGGATGACTGGGTAAGCAAAATGGAAGATTTAAATCAAGAGATACTAACTACACT 

TCATQOAGCCAGGAACAATTTGGCACAATCCATGATAACATTCAATACACCAGA^^^ 

AGCTCAAmGGAAAAGAarmGGAOTCATATTGOAAATO 

TTCCATTATAAAATATATAGTGATGTTmGCTTAmATITGTTACT^^ 

TAAGATCCrcAGGGCCOCTGGAAGGTGACCAGTGGTGCAGGGTCCrCTO 

CXTGAAGAAAAAATTCCATCACAAACATGCATCGCGAGAAOACACCTGGGACCAGGCCCA 

ACACAACATACACCTAGCAGGCGTGACCX3GTGGATCAGGGGACAAATACTACAAGCAGAA 

GTACTCCAGGAACGACTGGAATGGAGAATCAGAGGAGTACAACAGGCGGCCAAAGAGCTG 

GGTGAAGTCAATCGAGGCAmGGAGAGAQCTATATTTCCGAGAAGACCAAAGGGGAGAT 

TTCTCAGCCTGGGGCGGCTATCAACGAGCACAAGAACGGCTCTGQGGGG^ 

CCAAGGGTCCTTAGACCTGGAGATTCGAAGCGAAGGAGGAAACAmATGACTGTTGCAT 

TAAAQCCCAAGAAGGAACrCTCGCTATCCCrrGCTGTGOATTTCCCTTATGG 
GGGACrAGTAATrATAGTAGGACCK:ATAGCAGGCTATGGATTACGTGGACTCGCTGTrAT 

AATAAGGATTTGTATTAOAGGmAAATTTGATATTTGAAATAATCAGAA^ 

TTATATTGGAAGAGCrmAAATCCTGGCACATCTCATGTATCAATGCCTC^^ 

GAAAAACAAGGGGGGAACTGTGGGGTTmATGAGGGGTTTTATAAATGATTATAAGAGT 

AAAAAGAAAGTTGCTGATGCTCTCATAACCTTGTATAACCCAAAGGACTAGCTC^^ 

CTAGGCAACTAAA(XGCAATAACCGCATTTGTGACGCGAOTTCCC^ 

AACTTCCTGTTmACAGTATATAAGTGCTTGTATTCTGACAATTGGGCACTC^^ 

GCGGTCTGAGTCCCrrcrrCTGCTGGOCTGAAAAGGCCTTTGTAATi^ 

CTCAGTCCCTGTOCTAGmGTCTGTTCGAGATU^ 

TCATGGTCATAGCTGTTTCCTOTGTGAAATTOTTATCCGCrc 

CGAGCCGGAAGCATAAAGTGTAAAGCCTGGGOTGCCTAATGAGTGAGCTAACTCACATTA 

AnGCGTTGCGCTCACTGCCCGCmCCAGTCGGGAAACCTGTCOTGCCA^ 

GGCGGCCQAOGOKSCCTACGTOAACCATCACXXIAAATC^ 

CGTAAAGCTCTAAATCGGAACCCTAAAGGGAGCCCCOJAmAGAGOTGACGGGGA^ 

CCGGCGAACGTGGCGAGAAAGGAAGGGAAGAAAGCGAAAGGAGCGGGCGCTAGGGCGCTG 

GCAAOTGTAGCGOTCACGCTGCGCGTAACCACCACACCCGCCGCGCITA^^ 

CAGGGCGCGTCCATTCGCCATTCAGOCTOCGCAACTGTTGGOAAGG^ 

CCTCTTCGCTATTACGCCAGCCCGGATCGATCCTTATCGGA 

GTmACTTGCTTTAAAAAACCTCCCACATCTCCCCC^^ 

GCAATrGTTGTTGTTAACTTGTTTATrGCAGCTTAT^ 

ATCACAAATTTCACAAATAAAGCATTTTmCACTGCATTCTAGTTGT^^ 

CTCATCAATGTATCTTATCATGTCrOCTCGAAGCATTAACCCTCACTAA^ 

CGCCCGGOTCGACTTCACAGGTQTTTGaKK:GTCT^^ 

COGGGGCTGCTCTGCTCGCCCCACAGCCnTrCTTGTGC^^ 

GAGAAATCGCCCCTCTGGTCCTCGCGGAAGTAGAGCTCCCTCCAGATGCCGCGATTCACC 
TCTCCCAGCTCTTrAGCGGCrrGTTGCACGCCCCrAATTCTCX:^ 

aggacctcggotgcaaaatctggccx:ctaat^ 
tgggtgggaccggggccgaggtgtcntctggcgatgcaggtctggcrraggaat^^ 

TCGGGCAGGGACTGTCTCAGCACXK:GGCACCACTGGTCCCanx:CAM^ 

TroATCTT(X:ACX:AGTCGTTGCXKKXXn^^ 

TCTTGATCCCTGGCCTCCmXKnCTCAGCCAT^ 

GGTGGTGGGTCGGTGGTCCCTGGGCAGGGGTCTCCAGATCCCGGACGAGCCCCCAAATGA 
AAGACCCCCGAGACGGGTAOTCAATCACTCTGAGGAGACCCTCCCAAQGAACAGCGAGAC 
CACGAGTCOGATGCAACAGCAAGAGGATTTATTOGATACACGGGTACCC^^ 
TCTATCGGAGGACTGGCGCGCCGAGTGAGGGGTTGTGAGCTCrm 

GCAGAAGCGCGCGAACAGAAGCGAGAAGCAGGCTGATTGGTTAATTCAAATAAGGCACAG 

GGTCAmCAGGTCCTTGGGGGAGCCTGGAAACATCTGATGGGTCTTAAGA^ 

GGGTTGGGCCATATCTGGGGACCATCTGTTCnTGGCaXXS^ 

GACCATCTGTTOTGGCCCCGGGCCGGGGCCGAAACTGCTCACCGCAGATATCCTGm^ 

OCCCAACGrtAGCTGTTTTCGTGTACCCGCCCTTGATCT^ 

OGTATTTTTCCATGCXriTGCAAAATGGCGTrACTGCGGCT 

ATCTGGCCGAGGCGGCCTACTCTGCATTAATGAATCGGCCMCGCWCGGGGA 

TTGCGTATTGGGCGCTCTTCCGCTTCCn'CGCTCACTGACTCGCT 

CTGCGGCGAGCGGTATCAGCTCACTCAAAGGCGGTAATACGGTTATCCACAG^^ 

OATAACGCAGGAAAGAACATGTATAACTTCGTATAATGTATGCTATA^ 

GTGAGCAAAAGGCCAGCAAAAGGCCAGGAACCGTAAAAAGGCCGCGTTGCTGGCGTrm 

CCATAGGCrCCGCCCCCCTGACGAGCATCACAAAAATXX}ACGCTCAAGTCAGAG^^ 
AAACCCGACA<KJACTATAAAGATACCAGGCGTTTCCCCCTGGAAGCTCCCTCGTGCGC^^ 

TCCTGTTOCGACCCTGCCGCTTACCGGATACCTGTCCGa^^ 

GGCGCmCTCATAGCTCACGCTGTAGGTATCTCAGTTCGGTGTAGGTCGTTCGC^^ 

GCTQGGCTGTGTGCACGAACCCCCCGTTCAGCCCGACCGCTGCGCCTTATO 

TOmnTGAGTCCAACCCGGTAAGACACGACTTATC^ 

CAGGATTAGCAGAGCGAGGTATGTAGGCGGTGCTACAGAGTTCriTGAAGTOCTG^^ 

CTACGGCrACACrAGAAGGACAGTATTTGGTATCrGCGCTCTGCTGAAGCC^^ 

CXKJAAAAAGAGTTGGTAGCTOTGATCCGGCAAACAAACCACCGCTGGTAGCGGTGGm 

TrnGmGCAAGCAQCAGATTACGCGCAOAAAAAAAGGATCTCAAGAA^ 

CrmCTACGGGGTCTGACGCrcAGTGGAACGAAAACTCACGTTA^ 

GAOATTATCAAAAAGGATCirCACCTAGATOriTrTAAAT^ 

AATCTAAAOTATATATGAOTAAAOTGGTCTGACAGmcCAATGCTT^^ 

ACCTATCTCAGCGATCTGTCTAmCOTTCATCCATAGTTGCCT 

GATAACTACGATACGGGAGGGCTTACCATCTGGCCCCAGTGCTGCAATGATACCGCGAGA 

CCCACGCTCAOXJGCTCCAGATTTATCAGCAATAAACCAGCCAGCCGGA^^ 

CAGAAQTGGTCXITGCAACTTTATCXXSCXrrrc 

TAGAGTAAGTAGTTCGCCAGTTAATAGTTTGCGCAACGTnm- 

CGTGGTGTCAC^CTCGTCGTTTGGTATGGCTTCATTCAGCTCCGGTTCC^ 

GOTAGTTACATGATCCXXXATGTTGTGCAAAAAAGCGGTTAGCTCOT 

CGTTGTCAGAAGTAAOTTOGCCGCAGTGTTATCACTCATOOmTGGCAGCA 
TTCTCTTACTGTCATGCCATCCGTAAGATGCT^^ 

GTCATTCTGAGAATAGTGTATGCGGCGACCGAGTTGCrCrrGCCCGGCGrcA^^ 
TAATACCGCGCCACATAGCAGAACTTTAAAAOTGCTCATCAnTO 
GCGAAAACTCTCAAGGATCrrACCGCrGTTGAGATO:AOTTCX?AT^ 
ACCCAACTGATCTTCAGCATCrmACTTTC 

AAGGCAAAATGCCGCVVAAAAAGGGAATAAGGGCGACACGOAAATGT^^ 

CTTCCTTTTTCAATATTATTGAAGCATTTATCAGGGTTATTGTCTCATGAGCGGATACAT 

AmGAATGTAmAGAAAAATAAACAAATAGGGGrrCCGCGCACATTrcrc^ 

GCCACCTAAATTGTAAGCGTTAATATTTTGTTAAAATTCGCGTTAAAT^ 

AGCTCATTTmAACCAATAGGOCGAAATCGGCAAA^ 

ACCGAGATAGGGTTGAGTGTTGTTCCAGTTTGGAACAAOAGTCCACTATTAAA 

GACTCCAACGTCAAAGGGCGAAAAACCGTCTATCAGGGCGATGGCCCACTACGTGATAAC 

TTCGTATAATGTATGCTATACGAAGmTCACTACGTOAACCArcACCC^^ 

TTTGGGGrcGAGGTCCCOTAAAGCACTAAATaKJAACCCTAAA^^ 

AGOTGACGGGGAAAGCCAACCTGGCTTATCGAAATTAATACGACT^ 

CGGC 